DavidFerris's comments | Hacker News

There's a pretty big bias toward mechanical engineering components in the dataset: very few organic forms. It's one of the limitations we call out in the dataset card.

There are a few though! Try "dog" or "cookie cutter" for example.


It's CAD. Is there a legitimate reason to use that to engineer a dog? Doesn't make sense to me.

A dog form is highly useful for a robot.

Even purely artistic uses need parametric models of organic things though. Games, other non-game modelling, 3d printing.

Both games and general modelling need parametric versions of basically everything in the model. If you're trying to design and evaluate some cattle-processing facility, you'll want a lot of randomly varied cow models.

But I bet the biggest use is games and movies. You don't model every dog from scratch, you take the parametric dog model and move the sliders to get all the different dogs in your commercial or show or game.


The initial ABC dataset is from public Onshape files -- clearly some people had a reason to design a dog model parametrically!

I'd guess it's for 3D printing.

Not the best tool for the job of making something organically shaped, but maybe they also wanted to run some aerodynamics tests on it?


This isn't meant to be a commercially useful search engine, just a demonstration. You'll only be able to search for terms that the VLM could directly discern.

From the blog post: Our search demo proves that it works quite well. As anticipated, text search works well, returning sensible results for even irregular or poorly formed queries. It’s worth mentioning that this is very different from 3D part libraries like Thingiverse or GrabCAD. Search in those repositories requires users to tag or annotate parts with a description, the text of which is used in search. Our system takes only an unnamed part as input, requiring no additional labelling.
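
The tagless search described above (embed a caption of the part, rank by vector similarity, no user-supplied labels) can be sketched with a toy embedding. The trigram-hash `embed` below is a deliberately crude stand-in for a real text-embedding model; the captions and part IDs are made up:

```python
import math
import zlib

def embed(text: str, dim: int = 64) -> list:
    """Toy stand-in for a real text-embedding model: hash character
    trigrams of the text into a fixed-size vector, then normalize."""
    vec = [0.0] * dim
    padded = f" {text.lower()} "
    for i in range(len(padded) - 2):
        vec[zlib.crc32(padded[i:i + 3].encode()) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def search(query: str, index: dict, k: int = 3) -> list:
    """Rank parts by cosine similarity between the query embedding and
    each caption embedding -- no manual tags or part names required."""
    q = embed(query)
    scored = sorted(
        ((sum(a * b for a, b in zip(q, v)), pid) for pid, v in index.items()),
        reverse=True,
    )
    return [pid for _, pid in scored[:k]]

# Captions produced by a VLM looking at renders of unnamed parts.
captions = {
    "ABC-0001": "a flat rectangular plate with five evenly spaced holes",
    "ABC-0002": "a cylindrical gear with helical teeth",
    "ABC-0003": "a four-legged chair with a flat seat and backrest",
}
index = {pid: embed(text) for pid, text in captions.items()}
print(search("plate with holes", index, k=1))
```

This is also why a query only works when the VLM could plausibly have written something similar while looking at a render: the part's function never enters the index, only its appearance.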


I see, you did an AI demo of captioning and search over captures specifically for complex geometric objects.

I guess my interest was more piqued by the "CAD" part.


We rendered the one-million-part ABC dataset from Deep Geometry and open-sourced the data. We also built a fun demo with the following pipeline: CAD > render > caption > embed.
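
The four-stage pipeline can be sketched as plain orchestration: each stage is a function, and the index is just the accumulated outputs. The stage implementations below are stubs (the real system would use a CAD renderer, a vision-language model, and an embedding model), so every name here is illustrative:

```python
from dataclasses import dataclass

@dataclass
class IndexedPart:
    part_id: str
    caption: str
    embedding: list

def build_index(part_ids, render, caption, embed) -> list:
    """The CAD > render > caption > embed pipeline, one part at a time."""
    index = []
    for pid in part_ids:
        image = render(pid)    # rasterize the CAD model to an image
        text = caption(image)  # a VLM describes what it sees in the render
        vector = embed(text)   # text embedding enables similarity search
        index.append(IndexedPart(pid, text, vector))
    return index

# Stub stages so the sketch runs end to end without the real models.
parts = build_index(
    ["ABC-0001", "ABC-0002"],
    render=lambda pid: f"<render of {pid}>",
    caption=lambda img: f"caption for {img}",
    embed=lambda text: [float(len(text))],
)
print([p.caption for p in parts])
```

At a million parts the real version would batch the render and model calls, but the data flow is the same.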

Open-sourced dataset: https://huggingface.co/datasets/daveferbear/3d-model-images-...

Blog writeup: https://www.finalrev.com/blog/embedding-one-million-3d-model...


The search function doesn’t seem to work at all; it provides nonsensical results.

For example, if I search “supercolumns” I get regular household furniture.


Yeah, I think the embeddings describe what can be seen in a picture of the model, not what it is or what it's used for. Some searches work, like "Fan", but others don't: you can search for "plate with 5 holes" but not for a specific engine part cover.

When I search for "chair" I get 48 results, about 3 of which are actually chair-like.

- Some are clearly miscategorized: ABC-00131096 is a coffee table but has a very detailed description of its chair attributes.

- Many others are weird nonsense geometry, like ABC-00991744, ABC-00807798, ABC-00349255 or ABC-00822766.

- Some have a partially accurate description (if you pretend it's a chair): ABC-00685912 has a blocky geometric structure with a horizontal piece coming off a vertical piece, but then the description talks about an armrest on one side that doesn't exist at all.

- ABC-00388826 is a silhouette of a cat, which the description misses completely, and I don't see how you would sit on this "unique chair design characterized by its fluid and sculptural form."

Overall the descriptions are pretty useless and ascribe a lot of chairness to things that are not chairs.

Is a dataset with this much junk in it good for something?


Super cool! btw I love the name "Greptile" :)


Thank you!


Inspired by Ian Hickson's reflection on his 18 years at Google, I reflect on how Google got into their current slump, and what they can do to bounce back.


Interesting idea! One of the problems with any primary research (surveys included) is the delay in collecting responses, which can take hours to weeks depending on sample, IR, incentives, etc. This would solve that!

It's not surprising that LLMs can predict the answers to survey questions, but really good primary research generates surprising insights that are outside of existing distributions. Have you found that businesses trust your results? I have found that most businesses don't trust survey research much at all, and this seems like it might be even less reliable.

-----

Context: I co-founded & sold survey software company (YC W20).


Thank you!

Trust is one of the biggest issues we're trying to solve. This motivated the t-SNE plots and similarity scores under 'Investigate Results', but we definitely have a long way to go. Generally speaking, survey practitioners trust us more than their clients do (perhaps not surprising).


You might want to take a look at the papers I've linked here that go into this kind of research:

https://news.ycombinator.com/item?id=36868552


Very cool! I constantly struggle trying to do things in spreadsheets that are easy in Python, but I/O makes it annoying to write one-off scripts for a 30-second op. This would solve that pain point!

I would love a Google Sheets integration, since that's where I already live with most of my CSV/Sheets data + it would seamlessly fit into my workflow. If this was a Chrome extension I would have installed it today.

As is, I don't see myself using another spreadsheet app.


Switching tools is hard; that's something we understand. Of course, you can import sheets into Neptyne fairly easily. Would a two-way sync (where a certain region of a Neptyne document and a certain region of a Google Sheet are automatically kept in sync) make you take another look?


I wrote about a bad thing that happened to me recently and what the web3 people can learn from it.


> "However, the algorithm is only valuable for detecting scams accurately after they have been executed."

Forgive me, but isn't it only useful to detect scams before they happen?


It’s more useful, sure. But it’s certainly still useful to know historical information.


I've been using Superpowered for the past 2 weeks and can honestly say the experience is great. I don't live in the browser and the Mac calendar app is pretty awful, so this has been a definite workflow improvement for me. I probably click on it 10-20x per day.

That said, when my trial expired today I elected not to upgrade, in no small part due to the insane memory usage.

