
I don't want to step on these guys' PR, but I do have a similar personal project for anybody who is interested, http://newspaper23.com

Initially it's just an aggregator that presents commentary in plain text. I plan on adding a summarizer one day. For a personal project, I've been using it daily for over a year, so I know I find a lot of value in this type of thing.

As sites try to get more sticky, the signal-to-noise ratio decreases. You spend more time reading a lot of trivial articles that a Facebook friend recommended instead of a few articles that you've scanned yourself. I know Google and FB say social search is the cool thing, but in my experience the only thing it does is increase consumption of mediocre shiny stuff. Much better to pre-qualify sources and then control the depth of your dive. For newspaper23, one of the original ideas was a timer for each day. 30 minutes of scanning and the site would refuse to load until the next day.
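A minimal sketch of that daily timer idea, assuming a 30-minute budget that resets at the start of each calendar day. The class and method names are made up for illustration; this is not how newspaper23 is actually built.

```python
from datetime import datetime, timedelta


class DailyReadingBudget:
    """Tracks reading time per calendar day and blocks once the budget is spent."""

    def __init__(self, limit=timedelta(minutes=30)):
        self.limit = limit          # assumed budget: 30 minutes per day
        self.day = None             # calendar day the counter belongs to
        self.used = timedelta(0)    # time spent so far on that day

    def record(self, when: datetime, spent: timedelta) -> None:
        # Reset the counter when a new calendar day starts.
        if when.date() != self.day:
            self.day = when.date()
            self.used = timedelta(0)
        self.used += spent

    def allowed(self, when: datetime) -> bool:
        # A day with no recorded reading always starts with a full budget.
        if when.date() != self.day:
            return True
        return self.used < self.limit
```

A site would call `allowed()` before rendering a page and `record()` after each reading session; once the budget is spent, `allowed()` stays false until the date changes.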

I'd like to see more of this type of thing -- gearing content consumption to humans instead of site creators and advertisers.



Looks like there are a few of us on HN trying to solve the news consumption problem. My site[1] tries to crowdsource the summaries by encouraging the readers to summarize a story themselves. In the meantime, we create most of the summaries in-house.

Newser[2] tried to do the same thing but gave up and just focused on in-house created summaries.

Regarding the summary process, it's good to see another group manually summarizing posts as opposed to Summly[3], which tries to automate the process. For many reasons already stated in another comment[4], I don't think automation will ever work.

Good luck to the guys over at http://toolong-didntread.com and http://newspaper23.com. I hope one of us gets some real traction!

[1] http://skimthat.com [2] http://www.newser.com/ [3] http://summly.com/ [4] http://news.ycombinator.com/item?id=4741855

Edit: Another product that tries to summarize news http://cir.ca/


I'll add my baby to the pot: http://skimfeed.com.


This is actually really nice, well done. Hard to drop the frills for thrills so to speak. This is a much more functional site than most of the similar bootstrappy approaches, despite it looking like it came out of the 90s. It's actually useful, I like.


Awesome site man. I'm learning PHP so that I can make my own personalized news site like that. Can you give a short summary of how you built it? Or what technologies I'd need to use to create my own news aggregator?


That looks really good. I am not familiar with many of the sites on there, but I noticed no way of getting to HN discussions from your site. Do you think some way of reaching the discussion or comments section of some of the feeds would be helpful to others?


Made me think of Jimmyr that I used several years ago... http://www.jimmyr.com/ I really like yours. Time to upgrade, thanks!


I like it! Simple and I can actually see myself using this.


Inspired by AffBuzz?


popurls on courier new?


I really like your site, skimthat.com. However, I think the summaries should be condensed even more. For example, this summary is 203 words:

http://skimthat.com/4407/ind-home-explosion-now-homicide-inv...

It could have been summarized down to something like: Rumors are circulating that a suspicious white van was parked in front of the house that blew up.

I would rather it be super short with just the important fact. If it piques my curiosity, then I'll click to the full article.

I don't know, that's just my opinion.


Thanks for your feedback. I see what you're saying about a shorter summary but I'd consider your sentence more of a tease than a summary.

Our goal is to give the reader a good understanding of the topic while leaving out the unnecessary details and redundant information. The original had 465 words. That means that the Skim That summary cut the story down by over 50% while still giving you a strong overview of the story.
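The arithmetic behind that "over 50%" figure, using the two word counts quoted in this thread (465 words for the original, 203 for the summary). The function name is just for illustration.

```python
def reduction(original_words: int, summary_words: int) -> float:
    """Fraction of the original word count removed by the summary."""
    return 1 - summary_words / original_words


# Word counts quoted in this thread: 465-word original, 203-word summary.
print(f"{reduction(465, 203):.1%}")  # prints 56.3%
```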


I see. If that's the case, then I think there is room for a "teaser" news site that gives you just a bit more than the headline. If you want more details, you can click on a more detailed summary, or if you want to read the whole thing, then you can click to the full article.


adding http://tldr.it as a generic approach.


Another good example of an automated attempt to summarize the news that'll never work.

Here's why: Let's assume we have the perfect algorithm that knows exactly which sentences to pick for a good summary. You'll still end up with a summary that's horribly out of context and difficult to read.

Try it yourself. Assume your brain is the perfect algorithm and pick the most important sentences for a summary. Then try to read just the sentences that you picked without rewriting it into a coherent paragraph. More than likely, it'll be an out of context confusing block of text.

Algorithms will never solve the summarizing process unless they are teamed up with a rewriting engine that could build coherent paragraphs.


> Another good example of an automated attempt to summarize the news that'll never work.

> Algorithms will never solve the summarizing process unless they are teamed up with a rewriting engine that could build coherent paragraphs.

That doesn't sound like never. Frankly, I wouldn't be surprised if such an AI were created within 50-100 years, i.e. in my lifetime. I think this comment exposes the common false thought pattern that if something is not viable in the next quarter, then it'll never happen.


No, you are misreading my statement. Please go back and look for the word "unless". Sentence picking algorithms alone will never solve the problem unless a rewriting engine is added.


Sorry, you are right; I misread the first sentence. I didn't realize that you referred to tldr.io specifically with it.


Probable mistype; you mean tldr.it, right? Just saying that as a cofounder of tldr.io, whose goal is similar to that of SkimThat: provide human-written summaries ;)


What steps are you taking to mitigate spam and bias in summaries created by people?

I know if I wrote a summary of a Microsoft Surface article (I have one), it'd be very different from someone who loves it -- or is even an MS employee.

And spam... big issue there.

(disclaimer - I have your plugin and have tried it, summaries seem way too short.)


For anyone working on the summary problem: Chapter 8 of O'Reilly's Mining the Social Web has a great description of how you can use Python NLP libraries to accurately break text into individual sentences, then analyze those sentences to pull out the most important. I don't know how well it works in practice, but their example is amazing.
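A rough sketch of the general split-then-score idea, not the book's code. It assumes a naive regex sentence splitter in place of a real NLP library, a tiny made-up stopword list, and a plain word-frequency score; all function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it",
             "that", "was", "for", "on"}


def split_sentences(text):
    # Naive: split after ., ! or ? followed by whitespace.
    # This breaks on abbreviations such as "Dr. Smith".
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(sentence):
    # Lowercased word tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def top_sentences(text, n=2):
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in words(s))
    # Score each sentence by the summed frequency of its words.
    scored = [(sum(freq[w] for w in words(s)), i)
              for i, s in enumerate(sentences)]
    # Highest score first; ties go to the earlier sentence.
    best = sorted(scored, key=lambda p: (-p[0], p[1]))[:n]
    # Return the winners in their original document order.
    return [sentences[i] for _, i in sorted(best, key=lambda p: p[1])]
```

Summing raw frequencies favors long sentences, so a real scorer would probably normalize by sentence length or use something like TF-IDF.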


The sentence segmenter doesn't work that well, and rooting words (stemming) doesn't work as advertised. Try "Children", "Families", "Family", and "complies".
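To see why rule-based stemming stumbles on words like these, here is a hand-rolled suffix stripper. It is an assumption-laden toy, not the stemmer from the book or from any library, and the suffix rules are invented for illustration.

```python
# Suffixes are tried in this order; the first match wins.
SUFFIXES = ("ies", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix; no dictionary, no irregular forms."""
    w = word.lower()
    for suffix in SUFFIXES:
        # Require a few leading characters so short words are left alone.
        if w.endswith(suffix) and len(w) > len(suffix) + 2:
            return w[: -len(suffix)] + ("y" if suffix == "ies" else "")
    return w


for w in ["Children", "Families", "Family", "complies"]:
    print(w, "->", naive_stem(w))
# Children -> children
# Families -> family
# Family -> family
# complies -> comply
```

The regular plurals collapse as hoped, but "Children" never becomes "child": suffix rules alone cannot handle irregular forms, which takes a dictionary-based lemmatizer.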


Here's mine, http://textteaser.com/. It's basically an API that accepts the URL as an input and outputs a JSON result. It's only a preview so expect more to come.


I'm having trouble getting http://m.yahoo.com/w/legobpengine/news/blogs/clinton-white-h... to work.

You should have a few links listed on that page that you know will work well as examples for people to try the service out with.

Btw, how do you feel about my other comments in this thread about algorithms not working well? My basic point is that they will most likely produce paragraphs that are out of context.


>My basic point is that they will most likely produce paragraphs that are out of context.

Disagree - having written a bunch of these gist extractors I have found that the good ones do not produce out of context paragraphs. In fact, that's pretty much the point - to find the salient portion of the content.


My argument is that many times the salient portion is written in such a way that it's continuing a point from a previous part of the article. So, when you remove it from the complete context of the article, the sentence seems like it's out of place even if it has an important detail.


Mine works well with that link. Thank you for your suggestions, but the website I created is just a preview. I will work on it though, and I'll keep you posted when I've made some changes.

They are out of context because they (we) are basically just doing extraction: we extract the most important sentences, arrange them in order, and present them as a paragraph or a list. I prefer presenting the top sentences as a list, though. In my opinion, a paragraph makes the sentences feel more out of context than a list does.
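A small sketch of that extract, reorder, and present step, assuming the sentences have already been scored by some ranking algorithm. The tuple layout and function names are hypothetical.

```python
def pick_in_order(scored, k):
    """scored: list of (position, score, sentence) tuples.

    Returns the k highest-scoring sentences in original document order.
    """
    top = sorted(scored, key=lambda t: -t[1])[:k]
    return [s for _, _, s in sorted(top, key=lambda t: t[0])]


def as_list(sentences):
    # One bullet per sentence: each item reads as a standalone point.
    return "\n".join(f"- {s}" for s in sentences)


def as_paragraph(sentences):
    # Joined as running text: gaps between sentences become more visible.
    return " ".join(sentences)
```

Both renderings carry the same sentences; the list form just stops implying that one sentence follows from the previous one.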

If the summaries were done through abstraction, the way humans do it, they would obviously be better. But why aren't we doing it? I believe abstractive summaries are the holy grail of automatic summarization.


you forgot http://tldr.io


Thanks! I didn't know about that site before and it looks like they have many contributors. I considered doing a bookmarklet like they do but I figured that I would need a critical number of people before it could be realistic to use.

In the meantime, I created a list[1] of popular stories from Reddit news sources. Each story links to a special hybrid page that allows you to write a summary on top while you read the story below.

[1] http://skimthat.com/unsummarized

edit: clarification


I read the tldr.io of this TL;DR service first.


Do you mean that you tried the tl;dr bookmarklet on Skim That? I just added a summary for Skim That using the bookmarklet so you should see it now.

I really like how it uses line-by-line bullet point fields to help write the summary. I experimented with something like that in the past. It kind of works: it isn't as smooth a read as a paragraph of text, but it does make the writing process easier.


You confused me there, but I see now that it was actually me who confused you as I meant to reply to czzarr instead...

It seems that I helped inspire you nonetheless, which is great of course!

As for what makes tldr.io great; it's mostly the Chrome plugin which adds TL;DR icons next to the links on Hacker News.

They still need to work on enticing more users to contribute back though.


When did Hacker News become "Spam my Stuff" news?

Isn't it possible to discuss the merits of the product provided in the opening post without one-line nonsense posts spamming your own thing?

JEEZUZ.


Your site is awesome! I've been looking for a way to read without distractions and without having to enter every single article into instapaper.

What many people underestimate is that summaries have an inherent problem: bias. No two people are ever going to summarize an article the exact same way - much less so when it comes to politics etc. So before I use a site to read summaries, I have to trust the brand (I trust The Economist, for they mostly differentiate between reporting and editorializing) and I will never ever trust an anonymous bunch of people (the "crowdsourcing" summaries-solution) to accurately summarize without bias. I value as-close-to-objective-reporting-as-possible very highly - judgments I can make on my own.


Objective reporting, if it ever existed, is a quaint artifact of history as far as I can tell. Even if explicit opinion or analysis is excluded, the mere selection of which facts to include and exclude is subject to bias of the writer/editors. The only way I can see to get the unbiased facts out of news reporting is to consume a variety of sources.


Once again, not to step on TL;DR, but since most people commenting here are saying that curation is not scalable or sustainable, I wanted to present it in this way:

Phase 1: The MVP of a news aggregator should be curated/editorial, like traditional media, magazines, etc. [1]

Phase 2: The second stage should be to have submissions from your readers (on content that interests them).

Phase 3: The next would be to have a voting mechanism, and then to recognize that articles have different relevance in different communities - which is what prismatic and ypander were aiming to solve.

Full Disclosure: I am working on summarizing technology news, and the tagline so far has been "Hacker News on Steroids." Before it gets to that stage, I'm skimming my favourite feeds from Google Reader. The best mobile wrapper I've found (since I'm targeting mobile as well) has been Feedly, which presents it more like a magazine. Keep in mind Feedly didn't start out as this 4 years ago.

Another project aiming to solve this problem with a bookmarklet is tldr.io (which has a more apt domain name, imo), though I personally haven't used it. Summly is another one, but after using it for a while, I found Feedly much better suited to my reading habits.

I'd love to team up with others who are working on this problem. My MVP is at http://dinopost.com. Drop me a line at aaron at dinopost.

[1] Phases based on Casey Accidental's blog post: Online News is Broken: http://caseyaccidental.com/online-news-is-broken/


hey there, did you know that tldr.io now offers an extension that allows you to read tldrs without leaving the HN frontpage? https://chrome.google.com/webstore/detail/tldr/ohmamcbkcmfal...

full disclosure: i'm one of the cofounders


For what it's worth, the OP's site is much easier to read or skim.

I'm guessing it has to do with column width and coloring.


Newspaper23, I have a very hard time adjusting my eyes to that font/typography combo. I simply can't read the titles fast enough to return to the site. Reddit and HackerNews are easier on the eyes (same with OP's link).


Very nice too. But some advice: the titles are way too big and look unprofessional, and the front page needs summaries. Other than that I like it.


Thanks. I threw it together using the first version of Bootstrap if I recall. I think the titles are actually buttons. Also the UI is buggy. I have to use the back buttons to get back to the main page. Many times it's easier just to reload the site to get to the first page.

But it scales nicely out to a tablet. And a phone. Part of the initial appeal for my writing it was to download all of the daily commentary as json to my tablet. Then I could walk around town and such without having to worry about an internet connection. The idea was -- how much real static text do you consume in a day? Can't be more than 100 or so articles. So why not just download the plain text and consume it at your leisure?
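A bare-bones sketch of that download-once, read-offline idea, assuming each article is a small dict of plain text. The field names and file layout are hypothetical, not the site's actual format.

```python
import json
from pathlib import Path


def save_daily(articles, path):
    """Write a day's worth of plain-text articles to one JSON file."""
    Path(path).write_text(
        json.dumps(articles, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )


def load_daily(path):
    """Read the articles back from disk; no network connection needed."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

A hundred or so plain-text articles is a small file, so a single JSON document per day is probably enough and no local database is needed.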

Another feature I wanted was the ability for people peering over my shoulder to NOT be able to tell what I was consuming. Many times at work or on the train I'd have some time to read commentary, and the last thing I wanted to do was load up a page with a branded look and feel. I wanted the plainest amount of pure text possible.

I appreciate the feedback. Although I use it daily, it's not high on my list of priorities. Next up I think I'll double the amount and type of content. Maybe after that I'll go back to the UI for some rework. For me this is more of a personal thing than a business feeler. So even if nobody else in the world likes it, I'm good. :)


I do want to step on their PR. This is like looking at an RSS feed. Try something like TLDRPlugin.com that actually summarizes content, and works with any web page, not just the few RSS feeds that some guy decided to include.


Well, here's mine: http://www.hackerreads.com/



