Forum bug reports

For all your discussion needs regarding Warcraft and the world it resides in.

Topic/Postby Gergel » 01 Jan 2017, 11:11

Yeah, I saw that too. Will do a deep-dive into forum logs soonish.
What kind of sick individual burns a book full of perfectly good dark arts?!
- Darkscryer Raastok
Gergel
Gergel Cosmic Smash!
 
Posts: 1778
Location: Estonia

Topic/Postby Shevron » 01 Jan 2017, 11:57

We did get past our 40 GB quota. Good thing it was the 31st, so it reset after midnight.

Curious how we blew through it though. Never happened before.


Double post merged on 01 Jan 2017 11:53

Gerg, right now at 11:52 am the counter stands at:

Bandwidth
654.52 MB / 39.06 GB ( 2% )

Something is not right. That's way too much data, considering how not busy the forum is.


Double post merged on 01 Jan 2017 11:56

Graphs show that we've averaged about 10-17 GB monthly for the past year.

Suddenly this December it's 40 GB, in a month that hasn't been particularly busy either.
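A quick back-of-the-envelope check on those numbers, as a rough sketch. It assumes the counter reset at midnight on 1 Jan, that traffic keeps coming in at the same rate all month, and that the host counts 1 GB as 1024 MB; none of that is confirmed here, and the function name is made up for illustration.

```python
def projected_monthly_gb(used_mb, hours_elapsed, days_in_month=31):
    """Extrapolate month-end usage in GB from usage so far (assumes a constant rate)."""
    mb_per_hour = used_mb / hours_elapsed
    total_mb = mb_per_hour * 24 * days_in_month
    return total_mb / 1024  # assumption: 1 GB = 1024 MB


# 654.52 MB used by 11:52 on the 1st, i.e. 11 h 52 min after the assumed reset
hours = 11 + 52 / 60
projection = projected_monthly_gb(654.52, hours)
print(f"Projected for January: {projection:.1f} GB")
```

At that rate January would land at roughly 40 GB again, so whatever ate December's quota was still going on the morning of the 1st.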
"Whomsoever takes up this blade shall wield power eternal. Just as the blade rends flesh, so must power scar the spirit."
User avatar
Shevron
Resident Grump
 
Posts: 8199
Location: A cave in Northrend

Topic/Postby Gergel » 01 Jan 2017, 13:36

Looks like a web crawler bot (SEMrush) has taken an unhealthy interest in the forum. I'm going to politely ask all web crawlers to cease and desist, using the "robots.txt" method, and block SEMrush with a bit more aggressive methods.

Will keep an eye on the logs.
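The robots.txt method boils down to a small text file at the site root that well-behaved crawlers read before fetching anything. A minimal sketch, assuming a blanket ban is what's wanted: the file contents, the `allowed` helper and the thread path below are made up for illustration, and the check uses Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: ask every crawler to stay out entirely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent, url="http://forums.rnp-moonglade.net/some-thread"):
    """Return True if robots.txt lets this crawler fetch the URL (path is made up)."""
    return parser.can_fetch(user_agent, url)


print(allowed("SemrushBot"))  # user-agent name is an assumption
print(allowed("Googlebot"))
```

Keep in mind that robots.txt is only a request. A crawler that ignores it can still fetch pages, which is why a server-side block is the sturdier option for a misbehaving bot.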
LinBeifong_o.gif

Topic/Postby Toot » 01 Jan 2017, 13:53

Excuse my ignorance, but what does a webcrawler do?
Toot
Plushie Superstar
 
Posts: 1781

Topic/Postby Gergel » 01 Jan 2017, 14:17

A webcrawler is how Google and other search engines get all their data.

It's a bot that downloads all the webpages that it can find (such as every thread and post in the forum), and then does something with the data it has gathered. Google, for example, uses the gathered data to provide search results. So if you google "rhyme and punishment toot", it uses this data to provide a link to Toot's profile on the forums.rnp-moonglade.net forum.
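For the curious, the "download every page it can find" part can be sketched in a few lines. This is a toy, not how Google or SEMrush actually work: the three-page site lives in memory, and the `crawl` function, `LinkCollector` class and page names are all made up for illustration.

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(fetch, start):
    """Breadth-first crawl: fetch a page, queue its unseen links, repeat."""
    seen = {start}
    queue = deque([start])
    pages = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page does not exist, skip it
            continue
        pages[url] = html
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# Made-up three-page "forum" held in memory, so nothing touches the network.
FAKE_SITE = {
    "/index": '<a href="/thread1">Thread 1</a> <a href="/thread2">Thread 2</a>',
    "/thread1": '<p>First post</p> <a href="/index">Back</a>',
    "/thread2": '<p>Second post</p> <a href="/thread1">See also</a>',
}

pages = crawl(FAKE_SITE.get, "/index")
print(sorted(pages))  # ['/index', '/thread1', '/thread2']
```

The `seen` set is what keeps a crawler from fetching the same page twice within one pass.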

Normally the webcrawlers should be reasonably polite and not hammer servers too much. But it looks like this damn SEMrush bot just keeps downloading the entire forum over and over, day after day, causing tens of bloody gigabytes of traffic. So now I've ordered the server to just deny access to it, so that it just gets a tiny error message everywhere, instead of big forum pages.
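For anyone wondering what "deny access" can look like in practice, here is a minimal sketch. The thread doesn't say how the server is actually set up, so this is not the forum's real configuration. It's a toy Python WSGI wrapper, and the names `block_bots`, `forum_app` and `request`, the "semrushbot" user-agent match, and the 50 KB page size are all assumptions made up for illustration.

```python
def block_bots(app, blocked=("semrushbot",)):
    """Wrap a WSGI app so blocked user agents get a short 403 instead of a page."""

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in blocked):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)

    return middleware


def forum_app(environ, start_response):
    """Stand-in for the real forum: pretends every page is about 50 KB (made-up size)."""
    body = b"x" * 50_000
    start_response("200 OK", [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def request(app, user_agent):
    """Call a WSGI app directly and return (status, number of bytes sent)."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    chunks = app({"HTTP_USER_AGENT": user_agent}, start_response)
    return seen["status"], sum(len(chunk) for chunk in chunks)


app = block_bots(forum_app)
bot = request(app, "Mozilla/5.0 (compatible; SemrushBot)")  # assumed user-agent string
human = request(app, "Mozilla/5.0 (Windows NT 10.0) Firefox/50.0")
print(bot)    # ('403 Forbidden', 10)
print(human)  # ('200 OK', 50000)
```

The bot still gets to knock on the door, but each knock now costs a handful of bytes instead of a full forum page.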

Topic/Postby Toot » 01 Jan 2017, 14:38

Thanks, that's what I'd suspected, but didn't know how Google and suchlike actually gathered their info. :)

Topic/Postby Dunnykin » 01 Jan 2017, 15:46

Image
Pepple wrote:I haven't because breasts.
2 people like this post.
Dunnykin
Enemy of the Tortoise
 
Posts: 797

Topic/Postby Gergel » 01 Jan 2017, 16:55

Now where'd I put my goshdanged EMP...
Shevron likes this post.

Topic/Postby Shevron » 01 Jan 2017, 18:03

Here ...

Image
"Whomsoever takes up this blade shall wield power eternal. Just as the blade rends flesh, so must power scar the spirit."

Topic/Postby Tormeron » 04 Jan 2017, 09:58

Sounds like a bugged crawler. It should only be crawling a website every now and then, not daily, and even if it crawls daily it doesn't need to download the entirety of the forum contents. Normally crawlers are only interested in the HTML text, because they can link to pictures and videos without actually downloading them.

When a Google bot visits the forums or websites I run, it barely registers any traffic, since it only downloads the HTML and such.

Also, what the hell does it have to download on the forum in gigabytes? 90% of the pictures are links to outside resources, and the vids are all outside resources.
Lilandris wrote:Liandrix' words not mine, but Tormeron is a god apparently. Probably a bit like Loki.

serendipity wrote:Reason: Potato.

Events stories, Torm's events thread Suggestion box
Tormeron
The soup master
 
Posts: 3959
Location: In a cookie jar
