I gave it a series of 11 images stripped of all metadata (a quick way to do this is sketched at the end of this comment). It performed quite well, only misidentifying the two taken in a small college town in the northeastern US. It got both photos taken in Korea correct (one with a fairly clear view of Haneul Park, the other a rather difficult-to-identify picture of Sunrise Peak that doesn't resemble anything on Google).
It got every other US location correct, ranging from a shot of under-construction Austin taken from the river to some fairly difficult shots in NYC (the upper halves of some buildings from the Rockefeller terrace, and the black wall of the MoMA).
While it wasn't perfect, I'm frankly shocked at how well it performed.
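For anyone wanting to reproduce the stripping step, here is a minimal sketch, assuming Pillow and a JPEG input with a placeholder filename; exiftool -all= photo.jpg does the same thing from the command line:

    # Sketch: drop EXIF/GPS metadata by rebuilding the image from pixel data only.
    # Assumes Pillow is installed; "photo.jpg" is a placeholder filename.
    from PIL import Image

    img = Image.open("photo.jpg")
    clean = Image.new(img.mode, img.size)
    clean.putdata(list(img.getdata()))  # copies pixels, not metadata
    clean.save("photo_stripped.jpg")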
The anthropomorphisation certainly is weird, but the technical aspect seems even weirder. Did OpenAI really build dedicated tools to have their models train on Google Street View? Or do they have generic technology for browsing complex sites like Street View?
I doubt the model was trained on Street View, but even if it was, LLMs don’t retain any “memory” of how/when they were trained, so any element of truthfulness would be coincidental.
If it's trained on Street View data, it's not unlikely that the model can associate a particular piece of context with Street View. For example, a picture can have telltale signs of Street View content, such as blurred faces and street signs, watermarks, etc.
Even if it wasn't directly trained on Street View data, it has probably encountered Street View content in its training dataset.
The training process doesn't preserve the information the LLM would need to infer that. Any answer it gives about its own training can't be anything other than plausible-sounding nonsense, which is what these models do best.
I think the test the OP performed (picking a random Street View location and letting it pinpoint it) would indicate that it has ingested some kind of information in this regard in a structured manner.
This is the most impressive ChatGPT chat I've seen yet. While I can theoretically accept how large-scale probabilistic text generation can lead to this chain of "reasoning", it really feels like actual intelligence.
It's been intelligence for a long time; the goalposts just shift, and people can't abstract the idea to an LLM. But language processing and large data processing itself IS a form of intelligence.
Maybe you're right, but I think it's more likely that it was trained on Street View photos and then invented a plausible justification for the guess afterwards (which is something I often see ChatGPT do: it easily arrives at the correct answer but gives bullshit explanations for it).
I played a round of Geoguessr against it and while it did a shockingly good job compared to what I was expecting, it still lags behind even novice human players.
The actual locations and its guesses were as follows (straight-line distances; a quick way to compute them is sketched after the list):
Bliss, Idaho - Burns, Oregon (273 miles away)
Quilleco, Biobio, Chile - Eugene, Oregon (6,411 miles away)
Dettighofen, Switzerland - Mühldorf, Germany (228 miles away)
Pretoria, South Africa - Johannesburg, South Africa (36 miles away)
Rockhampton, Australia - Gold Coast, Australia (437 miles away)
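The distances above are presumably great-circle figures; here is a quick haversine sketch in Python showing how they could be computed. The coordinates are rough city-centre approximations of my own, and Geoguessr measures from the exact pins, so the numbers won't match exactly:

    # Sketch: great-circle (haversine) distance in miles between two lat/lon points.
    from math import radians, sin, cos, asin, sqrt

    def haversine_miles(lat1, lon1, lat2, lon2):
        r = 3958.8  # mean Earth radius in miles
        dphi = radians(lat2 - lat1)
        dlam = radians(lon2 - lon1)
        a = sin(dphi / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlam / 2) ** 2
        return 2 * r * asin(sqrt(a))

    # e.g. Pretoria vs. Johannesburg, using rounded city-centre coordinates
    print(round(haversine_miles(-25.75, 28.19, -26.20, 28.05)))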
I had cause to try Google Lens today and found the location down to the exact address thanks to a veterinary clinic in the background of the image. ChatGPT got the country right but the wrong city.
I gave it some photos from Denmark and didn't even bother to strip the metadata. One it correctly said gave off "Scandinavian vibes"; every other photo was very wrong. I also gave it a photo of the French Alps, and it guessed Switzerland.
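As an aside, when the metadata isn't stripped, the location can often just be read out of the EXIF GPS tags. A minimal sketch, assuming Pillow 8.2 or newer and a placeholder filename:

    # Sketch: check whether a photo still carries GPS coordinates in its EXIF data.
    from PIL import Image, ExifTags

    exif = Image.open("photo.jpg").getexif()
    gps = exif.get_ifd(0x8825)  # 0x8825 is the GPSInfo IFD
    if gps:
        # Map numeric tag IDs to readable names (GPSLatitude, GPSLongitude, ...)
        print({ExifTags.GPSTAGS.get(k, k): v for k, v in gps.items()})
    else:
        print("no GPS metadata found")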
I gave o4-mini-high a cropped version of a photo I found on Facebook[0][1], and it quickly determined that this was in the UK from the road markings. It also decided that it was from a coastal city because it could see water on the horizon, which is the correct conclusion from incorrect data: there is no water; I think that's trees on a hill. It focused heavily on the spherical structure, which makes sense because it's distinctive, though it had a hard time placing it. It also decided that the building on the left was probably a shopping centre.
It eventually decided that the photo was taken outside the Scottish Exhibition and Conference Centre in Glasgow; in general, it considered Scottish locations more than others.
The picture was actually taken in Plymouth (so pretty much as far from Scotland as you can get in Britain), on Charles Street looking south-east[2]. The building on the right is Drake Circus, and the one on the left is the Arts University. It actually did consider Plymouth, but decided it didn't match.