Attention web-fu gurus!

Started by Noisybast, 15 September, 2006, 02:53:41 AM

Noisybast

OK, not really a web design question per se, but it's a technical issue that's beginning to get on my nerves, as I kind of half know what's up (sort of, ish).

As some of you know, I work as a lowly tech support chimp for a second-rate ADSL ISP (which will remain nameless). Since the company made some changes in a whole bunch of exchanges, lots of customers have been complaining that they can't access certain sites.

It's not a secure sites issue (although some of the affected sites are secure).

It seems to be linked to specific types of URL (and this is where my web-fu knowledge fails me). Consider the following examples:

http://www.mac.com/WebObjects/Webmail.woa?aff=consumer&cty=US&lang=en

http://www.amazon.co.uk/music-rock-classical-pop-jazz/b/ref=topnav__w_h_/026-0684999-9697213?ie=UTF8&node=229816

http://login.live.com/login.srf?id=2&svc=mail&cbid=24325&msppjph=1&tw=0&fs=1&fsa=1&fsat=1296000&lc=2057&_lang=EN

As far as I can tell, the common denominator is the bit at the end of each URL (?something=x&something-else=y).

At the risk of betraying my woeful lack of knowledge in this area, what I need to know is, what exactly does that last set of switches signify?

Going even further out on a limb, if anyone has any knowledge of how these pages are handled by MSANs, that would be most useful.

If anyone can answer my buffoonish questions (I'm thinking of Funt Solo and Slippo here), they may well be helping to make a lot of angry people very happy (as well as earning me a fair amount of corporate bullshit brownie points)!
Dan Dare will return for a new adventure soon, Earthlets!

Banners

Is it to do with the ampersands in the URLs? If they're typed direct and not encoded properly in the referring link this can cause problems.
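To show what "encoded properly" means here: inside HTML source, a literal & in a link is supposed to be written as &amp;, and the browser turns it back into & before making the request. A rough Python sketch using the first URL from the original post (the variable names are mine, purely for illustration):

```python
import html

# First example URL from the original post.
url = "http://www.mac.com/WebObjects/Webmail.woa?aff=consumer&cty=US&lang=en"

# How the URL should be written inside an HTML attribute: each & becomes &amp;
href_value = html.escape(url, quote=True)
link = '<a href="' + href_value + '">Webmail</a>'
print(link)

# What a browser does when it reads the attribute: decode the entities again.
decoded = html.unescape(href_value)
print(decoded == url)
```

If the round trip gives back the original URL, the link itself is well formed, and any breakage is happening somewhere after the browser has sent the request.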

I would guess that your original config somehow fixed this problem so that end-users would carry on happily when they clicked a malformed link, but that the changes you mention have somehow stopped that fix from working.

(If I'm right, you'd better let me know the name of your company so I can send an invoice).

M@

Art

"As far as I can tell, the common denominator is the bit at the end of each URL (?something=x&something-else=y)."

That's called the request string (also known as the query string). It's most commonly used to transfer data to the server about the page it's about to build. I'd be amazed if they could see much of the internet at all if that's what the problem is, since pretty much any site using more than the most static HTML uses them (see the URL for this page).
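For anyone who wants to see the pieces, here is a minimal Python sketch that splits the three example URLs from the first post into host, path and the name=value pairs after the "?". The helper name describe is just something I made up for this example.

```python
from urllib.parse import urlsplit, parse_qs

# The three example URLs from the original post.
urls = [
    "http://www.mac.com/WebObjects/Webmail.woa?aff=consumer&cty=US&lang=en",
    "http://www.amazon.co.uk/music-rock-classical-pop-jazz/b/ref=topnav__w_h_/026-0684999-9697213?ie=UTF8&node=229816",
    "http://login.live.com/login.srf?id=2&svc=mail&cbid=24325&msppjph=1&tw=0&fs=1&fsa=1&fsat=1296000&lc=2057&_lang=EN",
]

def describe(url):
    """Split a URL into host, path and a dict of query parameters."""
    parts = urlsplit(url)
    # Everything after the "?" is the query; "&" separates the pairs.
    params = parse_qs(parts.query)
    return parts.netloc, parts.path, params

for u in urls:
    host, path, params = describe(u)
    print(host, path)
    for name, values in params.items():
        print("   ", name, "=", values[0])
```

The server reads those pairs to decide what page to build; the browser and everything in between should just pass them along untouched.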

I guess you could check this theory easily enough by going to http://www.2000adonline.com/, seeing if that page works, then coming to a subpage like this.
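That check could also be scripted from an affected line. A rough sketch, assuming Python is available on the test machine; probe and compare are names invented for this example, and the opener argument only exists so the fetch can be swapped out.

```python
import urllib.error
import urllib.request

def probe(url, opener=urllib.request.urlopen, timeout=10):
    """Fetch a URL and return a short description of what happened."""
    try:
        with opener(url, timeout=timeout) as response:
            return "HTTP %d" % response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403 or 404.
        return "HTTP %d" % err.code
    except (urllib.error.URLError, OSError) as err:
        # No usable answer at all: DNS failure, reset, timeout and so on.
        return "failed: %s" % err

def compare(plain_url, query_url, opener=urllib.request.urlopen):
    """Probe a plain URL and one carrying a query string, side by side."""
    return {
        plain_url: probe(plain_url, opener),
        query_url: probe(query_url, opener),
    }

# Example call (needs network access, so it is left commented out):
# print(compare("http://www.2000adonline.com/", "<a subpage URL with a ? in it>"))
```

If the plain page comes back fine and the query-string page fails from the same connection, that points at the "?" part rather than the site.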

thinky

noisy

i agree with art that even though all of the URLs have QueryStrings on them, this in itself wouldn't be the problem

one thing all three examples *do* have in common though that stands out to me is that they have (what i would consider) 'non-standard' file extension types - .woa, .srf and no extension at all

logic dictates that this shouldn't be an issue to a user as the page itself is served by the remote server which will be accustomed to such file types, but it's just an observation...

sorry i can't be any more help

thinky
you think this isn't me? that's so sweet...
//http://www.adverseCamber.co.uk

Art

Hmm, could be the extension, though the Amazon link has no extension at all. I would have thought a blocker would be more interested in MIME type, which in all cases would be HTML.

Art

Have you tried pinging the relevant domains?

Noisybast

"Is it to do with the ampersands in the URLs? If they're typed direct and not encoded properly in the referring link this can cause problems"

I think so, yes. That's my hunch, anyway. It's not that they're being typed in wrong, more that any URL that contains them seems to be inaccessible. I'm at a loss to explain why, though.


"If I'm right, you'd better let me know the name of your company so I can send an invoice"

Heh, you might have to wait a while - they seem to have trouble making basic overtime payments on time, let alone consultancy fees...


"That's called the request string..."

Thanks Art. That's pretty much filled in the blanks for me. I had the gist of what it did, but needed confirmation & clarification. The affected users are pretty sorely limited in what they can view, but I need to get more examples of what people can't access before I can say for sure.


"i agree with art that even though all of the URLs have QueryStrings on them, this in itself wouldn't be the problem "

I know it shouldn't be a problem, but it's the only common theme I can find so far amongst the affected URLs. Weird, innit?


"Have you tried pinging the relevant domains?"

Yeah. The domains seem to be OK, it's just when you get to the specific pages with the long URLs that the trouble starts.


Thanks for the help, people. I'll try to find out more today...

Wake

Could it be connected with this...

"Perhaps the most controversial change in PHP is when the default value for the PHP directive  register_globals went from ON to OFF in PHP 4.2.0. Reliance on this directive was quite common and many people didn't even know it existed and assumed it's just how PHP works. "

The variables set in the URL are called globals and when I uploaded 2000AD Online to the new server I had to manually tell PHP not to ignore these globals to get the site working. A very strict security firewall might not allow globals, because they can present a security risk if the website isn't written properly.
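To make the quoted directive a bit more concrete, here is a toy Python model (not PHP, and not how PHP is actually implemented) of what register_globals did: with it on, every name=value pair in the URL silently became a variable in the script. The function run_page and the is_admin parameter are made up for illustration. Note that the directive is a setting on the web server running PHP.

```python
from urllib.parse import parse_qs

def run_page(query_string, register_globals):
    """Toy model of a carelessly written page; True means admin content is shown."""
    # Parse ?name=value pairs, keeping the first value for each name.
    request = {name: values[0] for name, values in parse_qs(query_string).items()}

    scope = {}  # variables visible to the "script"
    if register_globals:
        # With the directive ON, URL parameters leak straight into the scope.
        scope.update(request)

    # The bug: the script never sets is_admin itself, so whatever is
    # already in scope (possibly from the URL) decides the outcome.
    return scope.get("is_admin") == "1"

print(run_page("is_admin=1", register_globals=True))
print(run_page("is_admin=1", register_globals=False))
```

With the flag on, a visitor can switch on the admin branch just by editing the URL; with it off, the same URL does nothing. That is the security risk mentioned above.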

Cheers,

Wake

Funt Solo

"Since the company made some changes in a whole bunch of exchanges"

I reckon that's the key to the problem, right there.

(Apart from &lang; turning into < on the first url, but that happened on this page - it was &lang; when you pasted it in - so it shouldn't be the & problem, I guess.)

In other words - standard programmer response - it's a hardware problem.

I've read some stuff about various bugs coming in on urls and doing the whole overflow-exploit-thang, so there's an outside chance that a defence against that is somehow to blame - but the first one isn't even that long, so that seems unlikely.

I'll go back to my comfy response: hardware problem, or a server setting (which counts as a hardware problem).

Land of Logic reports: the one thing ALL those examples have in common is the "?"
++ A-Z ++  coma ++

Funt Solo

I expect Wake has hit the nail more clearly on the head.

Emperor

Could it be a caching issue?

My understanding is that quite a few ISPs cache pages to speed surfing up, but if all the settings have been messed with, they might be trying to retrieve things from the cache that don't exist where they think they do, or that never existed there at all (as those dynamic links are unlikely to be cached). First thing I'd do is ask the server monkeys to have a prod round the caching system with pointy sticks (don't worry, this is probably standard procedure).

You could check if it is the encoding in the query string with a simple text page of links people could try.
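Something like the following could generate that page. It is only a sketch: build_test_page is a made-up helper, and the URLs are the three from the first post. Each link appears twice, once with the ampersands written as &amp; (the correct HTML form) and once left raw, so users can report which variant fails.

```python
import html

# The three example URLs from the original post.
urls = [
    "http://www.mac.com/WebObjects/Webmail.woa?aff=consumer&cty=US&lang=en",
    "http://www.amazon.co.uk/music-rock-classical-pop-jazz/b/ref=topnav__w_h_/026-0684999-9697213?ie=UTF8&node=229816",
    "http://login.live.com/login.srf?id=2&svc=mail&cbid=24325&msppjph=1&tw=0&fs=1&fsa=1&fsat=1296000&lc=2057&_lang=EN",
]

def build_test_page(urls):
    """Return a bare HTML page listing each URL in two link variants."""
    lines = ["<html><body><ol>"]
    for url in urls:
        escaped = html.escape(url, quote=True)
        # Variant 1: & written correctly as &amp; in the HTML source.
        lines.append('<li><a href="%s">encoded</a>' % escaped)
        # Variant 2: & left raw, as in a hand-typed link.
        lines.append('    <a href="%s">raw</a></li>' % url)
    lines.append("</ol></body></html>")
    return "\n".join(lines)

print(build_test_page(urls))
```

If both variants fail for the same users, the encoding of the link is probably not the culprit.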

Other standard questions to ask the user:

1. What browser/platform combination do they use?

2. Can they get in at the main site page and then find the sub page? So if they were looking for the Alan Moore's Future Shocks collection - if the direct link doesn't work, does going into Amazon and searching for it work?

3. When they say it doesn't work, do they get an error (403 perhaps) or just a blank screen?

4. Is it links they have visited previously that they are returning to (via a link or a bookmark - if so which) or a new page?

That kind of data should help you narrow it down.
if I went 'round saying I was an Emperor just because some moistened bint had lobbed a scimitar at me, they'd put me away!
