hacker news with inline top comments    .. more ..    8 Mar 2017 News
Notepad++ V 7.3.3 Fix CIA Hacking Notepad++ Issue notepad-plus-plus.org
211 points by infogulch  1 hour ago   77 comments top 11
tptacek 54 minutes ago 11 replies      
I know it's annoying to hear this, but I'm going to keep saying it: this stuff is silly. The DLL injection stuff in the CIA leaks should embarrass the CIA. If you're calibrating your defenses based on the idea that application programs on Windows and OS X can defend against malware, you're playing to lose.

Here's the rootkits track from Black Hat 2008 --- keep in mind that this is almost a decade old and that it's public work:

* Deeper Door: Exploiting NIC Chipsets

* A New Breed of Rootkit: The System Management Mode Rootkit

* Insane Detection of Insane Rootkits

* Crafting OS X Kernel Rootkits

* Viral Infections on Cisco IOS

* Detecting And Preventing Xen Hypervisor Subversions

* Bluepilling (implementing a hypervisor rootkit) The Xen Hypervisor

This is just one year's work. If you summed all of it together, you're talking ~2.5 FTEs across 7 different research projects which we will very generously assume took a full year to develop (spoiler: no, none of them did). People who can write hypervisor rootkits command a pretty decent salary, but it's not 2x the prevailing SFBA senior salary. So this is at most mid-single-digit millions worth of work.

I don't know why the CIA has this team of people bumbling around with DLL injectors and AV bypasses. Maybe it's some weird turf thing they're doing against NSA? But the stuff in the CIA leaks is not the standard you need to be protecting yourself against.

brilliantcode 59 minutes ago 6 replies      
> Just like knowing the lock is useless for people who are willing to go into my house, I still shut the door and lock it every morning when I leave home. We are in a fking corrupted world, unfortunately.

He fucking nailed it. This is what all of us are feeling. It's almost like, we are collectively being punished by trying to make shit more secure. It makes me dysphoric with hopelessness.

I can no longer proudly claim the West is free. We are living in a surveillance state. It's going to be hard to point fingers at other authoritarian regimes around the world using the same tools.

I sometimes feel as if the West is a giant hypocrisy because of the double standards it upholds, but relatively better than its counterparts. Lesser evils.

staunch 10 minutes ago 0 replies      
Does anyone know where I can find the best summary/analysis of the CIA leaks?

Is there someone I can follow on Twitter that's doing great analysis?

agumonkey 4 minutes ago 0 replies      
Kudos for your first OSS contribution CIA.
Traubenfuchs 1 hour ago 3 replies      
NPP still starts on my PC - I am kind of insulted that I am not important enough for the NSA to replace my .dll files.

I mean I know I am not that important but it would have boosted my ego. Way to go NSA. Thanks for nothing.

alephnil 52 minutes ago 1 reply      
If you follow the links from the release notes to the relevant WikiLeaks pages, you will find a list that also includes Chrome, Firefox, VLC, Opera, and LibreOffice, as well as Kaspersky and McAfee antivirus and many other commonly used software packages, many of them open source.

It is also the first time I have seen evidence that they target Linux too, since they have a hijacked "CMD prompt" on Linux, whatever that means. They may also have targeted Linux with the other software packages, of course.

This is really scary.

svenfaw 1 hour ago 3 replies      
That "fix" is kinda useless. But at least he admits it in the release notes.
f4rker 54 minutes ago 0 replies      
"I can no longer proudly claim the West is free. We are living in a surveillance state."

This has been the case for a very long time.

Nothing is new under the sun - but now lots of people are seeing things for what they are.

thehardsphere 1 hour ago 4 replies      
Are his release notes always this... upbeat?
eps 1 hour ago 2 replies      
I wonder if they just check that WinVerifyTrust() returns OK or if they bother to also check the cert thumbprint.

Because if it's the former, it's trivial to just sign dll with any key and add it to the local trusted store (if needed).
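
The distinction drawn above can be sketched in a few lines. This is a minimal illustration, not Notepad++'s actual check: it assumes the signer certificate is already available as DER bytes, and the names `thumbprint`, `is_pinned_signer`, and `PINNED` are hypothetical. A chain-of-trust check accepts any certificate the local machine trusts, while a thumbprint check accepts only specific, known certificates.

```python
import hashlib
import hmac


def thumbprint(cert_der: bytes) -> str:
    """Return the SHA-1 thumbprint of a DER-encoded certificate as lowercase hex."""
    return hashlib.sha1(cert_der).hexdigest()


def is_pinned_signer(cert_der: bytes, pinned_thumbprints) -> bool:
    """True only if the certificate's thumbprint is one of the pinned values."""
    actual = thumbprint(cert_der)
    return any(hmac.compare_digest(actual, p.lower()) for p in pinned_thumbprints)


# Stand-in byte strings, NOT real certificates (assumption for illustration).
vendor_cert = b"vendor certificate bytes"
attacker_cert = b"self-signed certificate added to the local trusted store"

PINNED = {thumbprint(vendor_cert)}
print(is_pinned_signer(vendor_cert, PINNED))    # True
print(is_pinned_signer(attacker_cert, PINNED))  # False
```

A locally trusted but unknown certificate would pass a pure trust-chain check and still fail the pinned comparison, which is the gap the comment is asking about.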

vocatus_gate 1 hour ago 1 reply      
That was fast.
Rabbit hole leads to 700-year-old Knights Templar cave bbc.com
167 points by chris_chan_  1 hour ago   25 comments top 15
roywiggins 28 minutes ago 0 replies      
Info via an archaeologist friend: while these caves are "said to be used by the Knights Templar", they probably date from the late 18th century at the earliest and have nothing to do with the medieval order.



> One suggestion is that they were the result of quarrying during the 19th century, and were then turned by the landowners, the Legge family, into a grotto. It is alternatively speculated that the caverns are older, perhaps dating back at least to the 17th century, and some have associated them with the Knights Templar.

> The caverns are located beneath privately-owned woodland. Since at least the 1980s, they have sometimes been used for informal secret ceremonies and rituals, and vandalised, and were closed to the public in 2012 as a result. Later reopened, they were accessed by a photographer in 2017, and received widespread publicity.

frikk 1 hour ago 2 replies      
An interesting observation from the photos is that you can see evidence of light vandalism (mostly "names" carved into the rock). This implies these caves are part of some hyper-local knowledge (at least as a party room for teenagers).

- I wonder what the oldest vandalism is?

- Could it be that there are many others in the area, which is why this hasn't been formally discovered before? How many times has it been "rediscovered" in the last 700 years?

- Is it actually a rabbit hole (as in: a hole used by rabbits) or just a "door"? Seems likely that it "looks" like a rabbit hole, but is actually a (perhaps maintained) door into the cave.

brownbat 39 minutes ago 1 reply      
It looks like the archway is filled with holes for candles. Lighting for meetings could not have been as trivial a task as it was for the researchers, who seem to have been able to toss down some cheap super bright LEDs(?).

An hour of good lighting in a subterranean cave in those days must have cost a fortune. (Per Jane Brox, anyway...)


benmcnelly 18 minutes ago 0 replies      
They should have waited till April first to publish this, so people would think it's a Monty Python reference :p
c3t0 33 minutes ago 0 replies      
Video walking through the cave on Youtube is worth watching. https://youtu.be/maDTJsGgmD0
hulahoof 1 hour ago 0 replies      
Be interesting to know more about the land owner, the title made it seem like they 'found' the entrance but the article implies the owner always knew about it
DoodleBuggy 37 minutes ago 0 replies      
And after publicizing the name and blasting the location onto the internet, the unique archeological site will be vandalized profusely and destroyed in no time at all in the name of selfies and social media points. At best, visitor access will be limited or prevented entirely.
saycheese 54 minutes ago 0 replies      
The reference to a rabbit hole makes no sense unless it's symbolic, since rabbit holes are about the size of a rabbit.
AlexB138 1 hour ago 2 replies      
There are a shocking number of low-effort reddit-like replies here. Can we please not turn Hacker News into yet another blackhole of "joke" replies?
quirkot 1 hour ago 0 replies      
Looks like the Knights Templar ripped off Morrowind. The architecture is nearly identical. Sad!
morsch 1 hour ago 0 replies      
With still-lit candles, no less!
wyldfire 1 hour ago 0 replies      
Likely the Killer Rabbit of Caerbannog
NikolaeVarius 1 hour ago 2 replies      
I thought there was only one Holy Hand Grenade of Antioch. How did they get past the rabbit?
chrisseldo 41 minutes ago 0 replies      
Is anyone going to say it? "That rabbit's dynamite!"
enjo 1 hour ago 0 replies      
Dan Brown's reaction upon hearing the news:


An animated GIF that shows its own MD5 ccc.de
116 points by svenfaw  2 hours ago   32 comments top 17
soheil 40 minutes ago 0 replies      
Here is the explanation:

1. Generate a gif for each possible digit in the first column

2. Append collision blocks to each gif to make a 16 way collision

3. Repeat for each digit

4. Hash the final product

5. Replace each digit with the correct digit

From https://www.reddit.com/r/programming/comments/5y03g9/animate...

strictnein 1 hour ago 2 replies      
Took me a couple of minutes looking at this to realize why this was interesting.

It's like a baby being born holding its completed birth certificate.

MontagFTB 41 minutes ago 0 replies      
There's an explanation of how this 'attack' was composed here: http://crypto.stackexchange.com/questions/44463/how-does-the...
clishem 10 minutes ago 0 replies      
Title checks out.

 > md5sum md5.gif
 f5ca4f935d44b85c431a8bf788c0eaca  md5.gif
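
The check quoted above can be reproduced without `md5sum`. A minimal sketch, assuming the GIF has been saved locally as `md5.gif`; the helper names `md5_hex` and `file_md5_hex` are made up for illustration, and `EXPECTED` is the digest quoted in the comment.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """MD5 digest of a byte string as lowercase hex."""
    return hashlib.md5(data).hexdigest()


def file_md5_hex(path: str, chunk_size: int = 65536) -> str:
    """MD5 digest of a file, read in chunks so large files need little memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest quoted in the comment above; the file itself must be downloaded separately.
EXPECTED = "f5ca4f935d44b85c431a8bf788c0eaca"
```

With the file in the working directory, `file_md5_hex("md5.gif") == EXPECTED` should hold if the download matches the one the commenter hashed.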

z1mm32m4n 1 hour ago 1 reply      
Wow! I'd love to read about how this was made.
Exuma 30 minutes ago 0 replies      
It's like a quine almost. My favorite quine is... http://aem1k.com/world/ (view source)
quakeguy 1 minute ago 0 replies      
soheil 49 minutes ago 1 reply      
I wonder if it being an animated GIF as opposed to just an image has anything to do with it.
justindocanto 1 hour ago 1 reply      
Would love to know how this was made, if anybody knows
soheil 53 minutes ago 0 replies      
This is incredible, it's like winning the lottery except that you can play incredibly fast with almost no cost for each try.
mmanfrin 1 hour ago 1 reply      
I can't imagine how you'd begin to do this.
quirkot 1 hour ago 0 replies      
Kenji 1 hour ago 0 replies      
Now do a SHA1 one ;)
IAmGraydon 59 minutes ago 1 reply      
That is really amazing. How was this done?
svenfaw 1 hour ago 0 replies      
Writeup by @angealbertini coming soon
matreyes 1 hour ago 0 replies      
How did you do that!
Cloud Video Intelligence API cloud.google.com
182 points by hurrycane  4 hours ago   58 comments top 18
tyre 1 hour ago 1 reply      
I think their model should take a second pass on the words and probabilities, independent of the video.

Look at their example:

 Animal: 97.76%
 Tiger: 90.11%
 Terrestrial animal: 68.17%

So we are 90% sure it is a tiger but only 68% sure it is a land animal? I don't think that makes sense.

It could be that this is a weakness of seeding AI data with human inputs. I can believe that 90% of people who saw the video would agree that it is a tiger, while fewer would agree it is a terrestrial animal, because they don't know what terrestrial means.
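
The "second pass" suggested above could look something like the following. This is only a sketch: the parent/child hierarchy is an assumption supplied by hand (nothing here says the API exposes one), the function name `enforce_hierarchy` is hypothetical, and the hierarchy is assumed to contain no cycles. The rule is simply that if a label implies its ancestor, the ancestor cannot be less likely than the label.

```python
def enforce_hierarchy(scores, parents):
    """Raise each reported ancestor's score to at least that of its descendants.

    scores:  dict mapping label -> probability in [0, 1]
    parents: dict mapping label -> its immediate parent label (assumed acyclic)
    """
    fixed = dict(scores)
    for label, score in scores.items():
        ancestor = parents.get(label)
        while ancestor is not None:
            if ancestor in fixed:  # only adjust labels that were actually reported
                fixed[ancestor] = max(fixed[ancestor], score)
            ancestor = parents.get(ancestor)
    return fixed


# Scores from the example above; the hierarchy itself is hypothetical.
scores = {"Animal": 0.9776, "Tiger": 0.9011, "Terrestrial animal": 0.6817}
parents = {"Tiger": "Terrestrial animal", "Terrestrial animal": "Animal"}

print(enforce_hierarchy(scores, parents)["Terrestrial animal"])  # 0.9011
```

Taking the maximum is the most conservative correction; a real system might instead lower the child's score or re-normalize, which is a modelling choice this sketch does not settle.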

tambourine_man 1 hour ago 0 replies      
It amazes me how smart these guys at google are, and yet, they can't design a mobile site if their lives depended on it:


sna1l 3 hours ago 2 replies      
I wonder if Snapchat is/will become a large user of this service? Depending on the average response time of this API, Snapchat could get much better ad targeting analyzing their Stories content.

I imagine that they have something similar in house that they run since it is pretty vital to their core business, but you never know.

skewart 3 hours ago 6 replies      
I'm curious about how much use these general-purpose computer vision APIs are actually getting. How many companies out there really want to sift through a lot of photos to find ones that contain "sailboat"? I'm inclined to think a lot more companies would want to find "one of these five different specific kinds of sailboats performing this action", which is definitely not among the tens of thousands of predefined labels that Google, and Amazon, offer with their general purpose models.

High-quality custom model training as a service seems much more compelling.

wyc 3 hours ago 1 reply      
I think the most commercially successful application of computer vision has been quality-control devices (citation needed). Agriculture is very interested in CV for a return-optimization technique known as precision farming. Manufacturers pay for inspection of production throughout the pipeline. To predict where a mass-market CV could be successful, I think we should look for industries that have similar problems but cannot currently afford a bespoke modeling solution.
timc3 3 hours ago 1 reply      
I have been on the beta program for this and generally the results in our testing have been very good. I particularly like how granular the data can get.
imh 1 hour ago 1 reply      
The demo picture they chose is interesting. It's obviously a tiger, and is identified as such with only 90% probability. I appreciate the difficulty of the problem and how big of a success it is to achieve even that level of confidence, but that low level of confidence really shows how far we are from being able to simply trust computer vision. Still useful from an information retrieval perspective, I expect.
bitmapbrother 3 hours ago 2 replies      
It was really entertaining listening to Fei-Fei Li talk about AI and ML at Google Cloud. If you get the chance check it out on YouTube. I especially liked how she referred to video as once being the "dark matter" of vision AI.
aub3bhat 3 hours ago 1 reply      
I think there is a need for a comprehensive system for image and video data analytics, much like how we today have relational databases (Postgres, MySQL) and full-text search engines (Lucene/Solr). The approach Google and Amazon have been taking, which involves providing a "tagging" API, is frankly unimaginative.

I am working on Deep Video Analytics, an open source visual search and analytics platform for images and videos. The goal of Deep Video Analytics is to become a quickly customizable platform for developing visual & video analytics applications, while benefiting from seamless integration with state-of-the-art models released by the vision research community. It's currently in very active development but still well tested and usable without having to write any code.



frakkingcylons 1 hour ago 0 replies      
As a Cloud Prediction API user, it makes me a bit uneasy to see it left out of the image of their product suite. Is it effectively in maintenance mode now? I feel like TensorFlow is overkill for what I need and my use case doesn't fit into image/speech/video detection.
ar15saveslives 2 hours ago 3 replies      
Correct me if I'm wrong, but this is just a frame-by-frame labeling. You can download whatever pre-trained CNN, pass individual frames through it and get the same result.
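
The baseline described above can be sketched as follows. This is an illustration of frame-by-frame labeling only, not a claim about how the Cloud Video Intelligence API works internally; `fake_classifier` is a stand-in for whatever pre-trained CNN you would plug in, and `label_video` and `min_score` are hypothetical names.

```python
from collections import defaultdict


def label_video(frames, classify_frame, min_score=0.5):
    """Average per-frame label scores into video-level labels.

    frames:         iterable of frames (any objects the classifier accepts)
    classify_frame: callable returning a dict of label -> score for one frame
    min_score:      drop video-level labels whose average falls below this
    """
    totals = defaultdict(float)
    count = 0
    for frame in frames:
        count += 1
        for label, score in classify_frame(frame).items():
            totals[label] += score
    if count == 0:
        return {}
    averaged = {label: total / count for label, total in totals.items()}
    return {label: s for label, s in averaged.items() if s >= min_score}


def fake_classifier(frame):
    """Stand-in for a pre-trained CNN; here a frame is just an integer index."""
    return {"tiger": 0.9, "grass": 0.4} if frame % 2 == 0 else {"tiger": 0.8}


print(label_video(range(4), fake_classifier))
```

Averaging treats frames as independent images, so it ignores anything temporal such as shot boundaries or motion; whether a hosted service adds that is exactly the question the comment raises.
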
zitterbewegung 2 hours ago 0 replies      
Not the first: https://clarifai.com has a similar service.
soared 2 hours ago 0 replies      
Sounds similar to a company I worked with that took security camera footage from restaurants and identified employee theft and process inefficiencies.
joaoaccarvalho 1 hour ago 1 reply      
When you use these Google APIs, can Google keep/ use your data in any way?
kneel 1 hour ago 0 replies      
Cronenberg inception porn is coming
hartator 2 hours ago 1 reply      
It's awesome, but I can't really see any application besides content filtering and superficial content classification.
chimtim 2 hours ago 2 replies      
what is the "video" bit here? This is just running image recognition on a bunch of frames.
torechudi 1 hour ago 1 reply      
White House Echoes Tech: Move Fast and Break Things nytimes.com
12 points by fabrice_d  11 minutes ago   2 comments top 2
tptacek 2 minutes ago 0 replies      
I cannot fathom how Sam Altman thought that it would be a good idea to be a source for this article, let alone provide the specific quotes that he did. How does YC benefit from this? How does public policy benefit from it? "The Silicon Valley President in every way except the ideology is flipped"? "He did everything we tell our startups to do"? What's wrong with you?
abraae 0 minutes ago 0 replies      
A good philosophy for businesses that deliberately take on risk to maximise reward, and where the worst-case outcome of "breaking things" is insolvency and loss of jobs and investors' money.
Study: Potatoes can grow on Mars upi.com
63 points by Mz  2 hours ago   42 comments top 10
david-given 1 hour ago 2 replies      
From the article:

"The box mimics the day-night patterns of Mars, as well as its temperature, air pressure and atmospheric composition."

Wait, really? They're growing plants in what's practically a vacuum and at subzero temperatures? Really? That seems... stupendously unlikely... to me. Not that I wouldn't love to be wrong, mind.

Anyone have a link to actual data? Everything linked to is just press fluff.

Update: Actual articles on growing plants at very low pressure:

https://science.nasa.gov/science-news/science-at-nasa/2004/2... --- at 1/10 of an atmosphere, plants' metabolic balance gets screwed up and they go into drought response and die, no matter how much water is actually available.

http://online.liebertpub.com/doi/abs/10.1089/ast.2009.0362 --- but lichen's probably fine.

guelo 1 hour ago 2 replies      
I think this article is wrong about the temperature. There is little information available but here is the original press release http://cipotato.org/press-room/blog/indicators-show-potatoes...

From what I can gather they used soil from some Peruvian desert and increased the CO2 in the chamber. But there is no mention of simulating the -80F night temperatures typical on Mars.

donpdonp 1 hour ago 2 replies      
I know, I saw Matt Damon grow potatoes on Mars in an in-flight documentary called "The Martian".
nsxwolf 1 hour ago 2 replies      
"Scientists with the International Potato Center" made my day.
u801e 1 hour ago 2 replies      
They didn't mention whether they tried to simulate the radiation levels plants would experience on Mars in their experiment. I don't know whether that would change the results.
azernik 17 minutes ago 0 replies      
Why exactly did this experiment require a satellite? For biological purposes Martian gravity is probably closer to Earth gravity than to microgravity, and all the other conditions could be reproduced for cheaper and at larger scale in a ground laboratory.

This smells of a publicity stunt to me.

hossbeast 28 minutes ago 2 replies      
But how do they simulate Martian gravity?
DoodleBuggy 44 minutes ago 0 replies      
Hey, I read that book too.
petrikapu 1 hour ago 2 replies      
In the video there was liquid water. I didn't know that exists on Mars.
EGreg 52 minutes ago 1 reply      
But can they grow in Martian soil? Its regolith is missing crucial nutrients, methinks. Otherwise, wow! The Martian's premise might actually be realistic!
Instacart Closes Latest Funding Round at $3.4B Valuation bloomberg.com
99 points by rayuela  4 hours ago   126 comments top 15
apike 3 hours ago 7 replies      
In Vancouver, our most popular local supermarket chain (Save-on-Foods) has started offering a service where you can order groceries on the web or in an app at the same prices as in the store. Then, you can either quickly pick up the collected groceries for free, or you can pay them a fee for delivery if you prefer. Even though the chain is local and the service is new, the software works well, the service works well, and overall it's a good offering.

So my question is: how do investors project that an intermediary like Instacart can outcompete the same service provided by the supermarkets themselves? Doordash can compete on variety and speed, but most folks shop at the same one or two supermarkets all the time and rarely need groceries urgently.

webnrrd2k 1 hour ago 6 replies      
If there is as much money to be made as Instacart's valuation implies, then it's just a matter of time until the big chains, at least, start offering their own online shopping.

This strikes me as another version of AutoByTel.com. Remember them? They worked between you and the dealership, and generally got you a much better price on a car, usually the fleet rate.

They did well until dealerships figured out how to use the web, and since then the company hasn't been anywhere near as relevant. It's tough existing between web-based customers and physical dealers. There isn't a lot of loyalty, so as soon as there is a better price customers will change sites. Also, Autobytel's business is really offering highly qualified sales leads to dealerships. The basics are nothing fancier than that.

I see the same thing for Instacart. Grocery stores tend to be effective at shipping stuff around, but the same "shopping" experience has worked for many, many years.

It seems like the grocery stores have a lot more power than Instacart, and Instacart is going to have a very hard time getting between customers and stores. It's going to be hard to make a profit when stores offer free/cheap delivery and develop their own shopping apps.

saycheese 3 hours ago 1 reply      
For anyone that doesn't know how Instacart got into YC... "How Instacart Hacked YC":


jpm_sd 3 hours ago 2 replies      
It must be really weird to work for Instacart. The shoppers spend the entire day in the supermarket, shopping for other people, waiting in the checkout line over and over again. The drivers are just sitting around in the grocery store parking lot most of the time. [0]

[0] http://www.huffingtonpost.com/2015/02/02/instacart-workers_n...

largehotcoffee 1 hour ago 0 replies      
> $400 million round of venture funding

As it stands now, I believe Instacart will fail and this money will be wasted. The only hope for the company is to be acquired by someone larger looking to bootstrap grocery delivery (or maybe Uber or Amazon or something).

I regularly use Safeway grocery delivery https://shop.safeway.com/ecom/home and I've always been happy. Delivery fees can get as low as $4 while grocery items are (I assume) the same price as inside the store.

Instacart needs to do something different. Why don't they just copy Blue Apron/Hello Fresh and offer meal packages with all the ingredients and recipe? Hell, I'd be way more interested in the service if I could get that along with groceries.


Blue Apron

Hello Fresh


Amazon Fresh

etc etc

These companies need to become each other, before Amazon becomes all of them. It seems like the biggest innovation from Instacart I could find was "the company wasn't collecting beverage can and bottle deposits accurately until recently, and the fix has increased gross margins by 25 cents on average deliveries nationwide.". This company is doomed.

Side note, I still can't get over the fact that I can't view a single item they offer without creating an account. Even going to browse their website for this comment, I walk away disappointed.

dopamean 3 hours ago 0 replies      
I've found Instacart to be prohibitively expensive. When in New York I've used Fresh Direct and have never felt that I was being overcharged. I have never had an order with Instacart that made me think the increase in price was worth it. I'm sure a lot of people like it but I'm having a hard time being one of them.
jzig 3 hours ago 2 replies      
It's from those markups on sales tax [0].


asciimo 1 hour ago 1 reply      
I expected Whole Foods to acquire them. They have so much infrastructure on site at popular Whole Foods locations that it seemed inevitable.
simonkjohnson 2 hours ago 0 replies      
Not entirely on topic, but in Denmark we have something similar called "Vigo", where other people buy your groceries for you, and deliver them to you. The fee you pay is fixed (about 6 USD).

People who want to make a little extra cash then log on to the app, look for tasks nearby, and pick a task they want to complete. Then they buy it, deliver it, and get paid.

The stores are partnering with the app to provide their inventory, so that you can pick what you need and expect it to actually be available for your shopper to buy.

Curious to see if something like Instacart would break through here.

jonnynezbo 1 hour ago 0 replies      
We use InstaCart to keep the fridge and pantry stocked at my small company. It works like a charm, and saves us nearly 2 hours a week. You get "free" deliveries with InstaCart Express for $149/year. The website and iPhone app are easy to use, and interaction with the shopper works almost flawlessly. Honestly, it's a no-brainer for us.
overcast 3 hours ago 14 replies      
What's the big deal with going to a grocery store, I mean seriously? I can understand a market for wealthy people, who just want someone else to do things for them, but to have enough market that the common person would find this worth it?

Also, who wants someone picking through all their produce/meats/perishables? That's a pretty subjective thing.

tabeth 3 hours ago 3 replies      
Am I missing something or are all of these "I'm too rich to be bothered with this task" services just creating a new servant class in the United States?

This "servant class" has already existed, but now I feel as if it's being carved out of the already diminishing middle class.

misiti3780 3 hours ago 2 replies      
I just logged in for the first time and used a Lower Manhattan zip code; the options seemed fairly sparse. I assume that it is more useful in other parts of the country?
Hydraulix989 3 hours ago 0 replies      
Those had to be some pretty bad terms.
CptJamesCook 3 hours ago 4 replies      
I have no comment on the valuation, but the service has made my life better.

I no longer fight with my significant other about grocery store trips, and the house is always stocked with healthy, delicious food from Whole Foods.

In 3 years, I'm not sure I've ever had a problem with an Instacart delivery, other than them occasionally forgetting items.

edit: I should have mentioned that I'm not very picky about what is actually delivered, I'm just happy to have a bunch of fresh vegetables, fruit, and sparkling water stocked in my house at all times.

Microsoft Pledges to Use ARM Server Chips, Threatening Intel's Dominance bloomberg.com
284 points by rayuela  8 hours ago   157 comments top 13
imglorp 6 hours ago 8 replies      
This could well be a negotiating strategy to rejigger the relationship with Intel. Dell did exactly the same thing a number of times for similar reasons.

So all you do is: put out some press releases, ramp up some hires, put out some glossy product roadmaps. Intel starts to get separation anxiety and flinches: well, maybe they can afford to come closer to AMD prices for another year. They kiss up and you back out of your false posturing. Rinse and repeat every few years.

ChuckMcM 3 hours ago 0 replies      
I love the fact that this can even be a threat. I'm biased, as an old Sun guy who thought the x86 architecture was a bit too ad hoc[1] and way too proprietary for really clever innovation. I had pretty much given up on anyone challenging them in the server space, assuming that 100 years from now kids would marvel that their brain implant could be switched into 'real mode' to run something called 'dos' :-).

This is something that I give full credit to Linus and the other developers that have made Linux into a credible server operating system. Without that software base, ARM would never have been able to get where it has.

[1] I get it that most people never see the 'insides' of their systems but it's always been something I cared about.

foobiekr 2 hours ago 2 replies      
Has it been long enough for people to forget NT on Alpha, MIPS and x86 (and i860, though not released)? And to forget both PReP (https://en.wikipedia.org/wiki/PowerPC_Reference_Platform) and CHRP (https://en.wikipedia.org/wiki/Common_Hardware_Reference_Plat...) which were going to break the Intel stranglehold once and for all?
gigatexal 6 hours ago 1 reply      
This is awesome: with a renewed AMD and now ARM gaining another big proponent (Apple being the first imo) there might finally be some real competition to Intel's place at the top.
willvarfar 7 hours ago 2 replies      
> "We wouldn't even bring something to a conference if we didn't think this was a committed project and something that's part of our road map."

As anyone who rode the rollercoaster of abandonment in the ActiveX years recalls, their previous MO was all about things that became uncommitted after being in their road map :(

I really hope there's some kind of future for non-Intel players.

zbjornson 7 hours ago 5 replies      
Competition is great, but this also seems like a rather big setback to the goal of having multi-cloud applications, in instances where the app can only run on x86-64.
ptrptr 2 hours ago 2 replies      
Question - is this a sign of maturity of ARM architecture? Can we really expect desktop OS to move to ARM? Could Apple start transitioning Apple AX into their laptops?
AlphaSite 7 hours ago 1 reply      
I'm very curious how this will work for AMD with their K12 chip. They will have high performance x86_64 and AARCH64 chips.
frostirosti 1 hour ago 1 reply      
AMD really needed the break. This is fantastic news. Don't all consoles also use AMD chips?
hatsunearu 6 hours ago 2 replies      
Why though? ARM server chips haven't beaten Intel just yet.
deepnotderp 8 hours ago 1 reply      
They're not replacing Intel chips from the looks of it though, right?
mtgx 7 hours ago 3 replies      
Payback time?


Okay, obviously this partnership started years back, but it's nice to see that not everyone is willing to encourage Intel's monopoly, as Google often does (in Chromebooks, too, even though Intel's chips are virtually unnecessary there).

Although, to be fair, the "Wintel" name didn't come out of nowhere. Microsoft obviously played its part in growing Intel's monopoly for a long time, too.

robert_foss 8 hours ago 5 replies      
For what?

Microsoft is not really a dominant force in the server space.

About Drums: the physics of overtones circularscience.com
44 points by camtarn  3 hours ago   6 comments top 4
algesten 1 hour ago 1 reply      
It's a great article, but the piano tuner in me must correct a detail about "typical musical instruments" such as pianos. The overtones are not necessarily perfect multiples of the base waveform and this is called inharmonicity.

In fact, this is part of why a piano sounds like a piano and guitar sounds like a guitar.

For any piano and especially the upright, the bass strings are actually too short to produce any vibration of the main frequency, the only thing you are left with are the overtones and our brains fill in the rest.

And that brain fill is actually happening across the entire range of the instrument, our brain latches on to specific overtones depending on interval, and the piano tuner (electronic or human), must compensate for inharmonicity in that range.

This means the bass must be tuned lower than the middle which in turn is tuned lower than the upper regions.
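
The inharmonicity described above is commonly modelled with the stiff-string formula f_n = n * f0 * sqrt(1 + B * n^2), where f0 is the fundamental an ideal flexible string would have and B is the inharmonicity coefficient. A minimal sketch under that model; the value of `B` used below is made up for illustration and is not measured from any real piano.

```python
import math


def partial_frequency(f0: float, n: int, b: float = 0.0) -> float:
    """Frequency of the n-th partial of a stiff string.

    f0: fundamental of the equivalent ideal (perfectly flexible) string, in Hz
    n:  partial number, starting at 1
    b:  inharmonicity coefficient; 0 gives exact integer multiples
    """
    return n * f0 * math.sqrt(1.0 + b * n * n)


def cents(freq: float, reference: float) -> float:
    """Interval from reference to freq in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(freq / reference)


F0 = 110.0   # A2, chosen for illustration
B = 0.0004   # hypothetical coefficient, not a measured value

for n in range(1, 5):
    actual = partial_frequency(F0, n, B)
    print(n, round(actual, 2), round(cents(actual, n * F0), 2))
```

Every partial comes out sharper than the corresponding exact multiple, and the gap grows with n. Matching a higher note to those sharpened partials rather than to exact multiples is what pushes the tuning progressively wider across the keyboard.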



adamnemecek 1 hour ago 0 replies      
If you want to learn more about sound design[0], you should check out Syntorial http://www.syntorial.com/#a_aid=AudioKit. It's an interactive software synthesizer that teaches you in like an afternoon more than just about any book or video on the topic.

This tutorial series is also illuminating, but it's almost too detailed: http://sonicbloom.net/en/63-in-depth-synthesis-tutorials-by-...

You might also be interested in AudioKit https://github.com/audiokit/AudioKit, a (macOS|iOS|tvOS) framework for audio synthesis and processing.

[0] Sound design is such an interesting field, as it's both very artistic and, if you want, extremely math/physics/CS/stats heavy.

filmor 1 hour ago 0 replies      
A bit related (though much more theoretical): https://en.wikipedia.org/wiki/Hearing_the_shape_of_a_drum
dharma1 43 minutes ago 0 replies      
Folks at University of Edinburgh are doing some super cool stuff on physically modeled audio, including drums.


The Rusty Web: Targeting the Web with Rust davidmcneil.github.io
255 points by 314testing  9 hours ago   92 comments top 8
sunfish 8 hours ago 2 replies      
It looks like the slowdown on wasm is related to the powi call in the distance function (which the benchmark page mentions as a likely suspect). Emscripten currently compiles @llvm.powi into a JS Math.pow call; however, this powi call is just being used to multiply a number by itself. I filed an issue to track this in Emscripten.

In JS, the assumption is that code may be written by humans, so engines are expected to optimize things like Math.pow calls with small integer exponents implicitly, which is likely why this code is faster in JS. And in the asm.js case, the page mentions that it's using "almost asm", which is not actually asm.js, so it's using the JS optimizations.

This is one of the characteristic differences between JS and WebAssembly: in WebAssembly, the compiler producing the code has a much bigger optimization role to play.
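As a rough illustration of the powi point (in Python rather than Rust, and not the benchmark's actual code), the two functions below compute the same distance. The first goes through a generic pow call for the square; the second writes the square as a plain multiplication, which is the rewrite a compiler would ideally do on its own. The function names are hypothetical.

```python
import math

def distance_pow(a, b):
    # Square each difference through a generic pow call.
    return math.sqrt(sum(math.pow(x - y, 2) for x, y in zip(a, b)))

def distance_mul(a, b):
    # Same result, with the square written as a plain multiplication.
    return math.sqrt(sum((x - y) * (x - y) for x, y in zip(a, b)))

print(distance_pow((0, 0), (3, 4)), distance_mul((0, 0), (3, 4)))
```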

markdog12 8 hours ago 2 replies      
Unreal Engine's Zen Garden using WebAssembly: https://news.ycombinator.com/item?id=13820633
rubber_duck 8 hours ago 2 replies      
Interesting how when you read the articles about WASM and ASMJS you see stuff like "1.5-2x performance loss over native code" but here it's 35-3x.

And this is a computation benchmark, it's not even hitting sand-boxed API overheads.

makmanalp 6 hours ago 3 replies      
As a sidenote, does anyone know how this emscripten stuff works? I understand how you can compile down to a different language, but is there a shim layer that converts OS level API calls to browser APIs? Or is it lower level than that? Either way, that sounds like a massive undertaking and it blows my mind that any of this is even possible.
floatboth 8 hours ago 2 replies      
"Native WebAssembly support is not available in this browser" Firefox Nightly. Something's wrong with the detection
CorySimmons 8 hours ago 1 reply      
A few comments are mentioning the implementation is probably dated.

Does anyone have benchmarks that tell a different story?

loppers92 7 hours ago 1 reply      
That sounds so good. I don't like javascript at all because it's one of the worst programming languages in our tiny world!

Almost every programming language is better than javascript. The roots of javascript are in tiny special effects, not huge well-running applications. Typescript and all the other frameworks for javascript are just workarounds for a buggy language.

In my opinion the final step is to completely replace javascript with something else!

Animats 4 hours ago 2 replies      
This is accomplished by compiling Rust to asm.js or WebAssembly.

This seems an overly complex approach. Just because you can do it doesn't mean you should. Most of the advantages of Rust don't apply when you're targeting the Javascript engine of a browser.

Modern Javascript, as a language, isn't that bad. Use the right tool for the job.

New island in North Sea to be hub for wind turbine power to 80M Europeans tennet.eu
37 points by flexie  3 hours ago   10 comments top 3
Johnny555 1 hour ago 2 replies      
I remember when the USA used to be able and willing to create interesting cutting edge technology like this. Now we're building oil pipelines, cutting car efficiency standards, and promoting coal. Anything that involves alternative energy has suddenly become "too expensive", "bad for jobs", or "liberal elitism".

Soon we'll have reverted back to the 1950's, which apparently is where "we" want to be -- even back to the cold war era nuclear ramp-up. I suppose we'll also be back to building back-yard bomb shelters, I guess that will help with jobs.

timthelion 1 hour ago 1 reply      
Maybe they would actually be better off without having green areas on the island. Tiny islands in the middle of the sea tend to be major bird magnets, and that doesn't seem to be a good mix with wind. It looks nice on the render, but it would probably be better to make the island surface as inhospitable to birds as possible to make sure they don't get in the habit of flying there and getting killed by the turbines.
mynewtb 1 hour ago 0 replies      
Not "to be" but "envisioned by tennet"! Sounds great though!
ScyllaDB Closes $16M in Series B Funding scylladb.com
32 points by bsg75  2 hours ago   21 comments top 6
TheGuyWhoCodes 55 minutes ago 2 replies      
Congratulations to the team!

I would love to move over from Apache Cassandra to Scylla but honestly I'm a bit afraid to do that. I have no doubt that it's much faster but I haven't seen hard numbers about consistency and availability. Apache Cassandra is a much older project with many installations and is battle tested (to a degree). How can I be sure that Scylla will be as stable as Cassandra in that regard?

hendzen 1 hour ago 2 replies      
If I was Datastax I would be scared of Scylla. They have momentum and they are one of the best engineering teams around outside of top teams at Google/FB etc.
jdoliner 1 hour ago 4 replies      
Does anyone else find it a bit weird that a post on ScyllaDB's blog announcing their fundraising starts with: "ScyllaDB announced today that it..."? It seems weirdly self referential to me. They're definitely not the only ones to do this though.

Anyways, congrats on the funding guys, certainly not trying to cast shade.

ram_rar 8 minutes ago 1 reply      
I like ScyllaDB, but I am not sure why it needs to run on XFS only.
rosslazer 16 minutes ago 2 replies      
I don't get it. Is this just a faster Cassandra? What's their competitive niche?
dominotw 23 minutes ago 1 reply      
Garbage collection has been the pain point for java based tech like hadoop. Interesting to see databases being written in Golang, which I imagine would have the same issues.
SJCL Stanford JavaScript Crypto Library github.com
123 points by remx  7 hours ago   83 comments top 16
colept 6 hours ago 2 replies      
Shameless plug: SJCL is a great library and easy to work with.

I've used it to build a non-profit decentralized encryption tool that can be used to send and receive files that will self-decrypt using SJCL, JavaScript FileReader, and HTML5 download attributes.

User A creates a password to encrypt a file using this client-side mechanism - which produces a self-decrypting HTML file. User B opens this HTML file in their browser which will ask them for the password to decrypt the file allowing them to download the original file all without a server. The homepage can be downloaded and self-hosted at will.


edit: I have uploaded it to Github today for easy self-hosting: https://github.com/colepatrickturner/zipit

jpgoldberg 5 hours ago 2 replies      
It is important to separate three security concerns:

1. Crypto delivered to the browser over HTTPS depends on the integrity of HTTPS.

2. A browser is a very hostile environment (injected JS, other browser extensions, etc.)

3. JavaScript may not be the best language for coding certain things (e.g., it is hard to remove strings from memory)

Depending on your use, some of these might be larger concerns than others. For us, 1Password.com, (1) is the biggest concern for those using the web-app. Our approach is to be very strict about TLS (TLS 1.2 only, HSTS, etc) and to encourage use of the native clients over the web-app.

lucideer 6 hours ago 2 replies      
In every one of these threads, there are inevitable comments along the lines of "in-browser crypto is inherently unsecure", which often follows to a more general "javascript crypto is inherently unsecure".

This question might be slightly off-topic here, given this seems to be an in-browser library, but can anyone who knows a bit more about this topic than I comment on the state of out-of-browser JS crypto (e.g. NodeJS)? The browser as environment does seem to introduce a stigma around security to the entire JS ecosystem, and I wonder if it's warranted.

0XAFFE 2 hours ago 1 reply      
How does this compare to libsodium(.js)[1][2]?

[1] https://github.com/jedisct1/libsodium

[2] https://github.com/jedisct1/libsodium.js

aabajian 1 hour ago 2 replies      
One of my first jobs involved encrypting medical data in the client-side browser. The plan was to generate a new RSA private/public key pair whenever a new user joined. The keys themselves were encrypted using the user's password (w/AES) on the client side. We stored the encrypted RSA keys and the user's password hash on a server. When a user logged in we would validate the user's password hash, and return the user's encrypted keys. The user would have to know their password to decrypt the RSA key. The theory was that no medical data was visible on our server, even to us. We also used standard SSL for all connections.
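A minimal sketch of the password handling in a scheme like this, with loud assumptions: PBKDF2 from Python's hashlib stands in for whatever derivation was actually used, and the names derive, wrap_key, login_hash and server_accepts are hypothetical. The point it illustrates is that the value the server stores for login must differ from the key that wraps the RSA private key, or the server could unwrap the key itself.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch

def derive(password: str, salt: bytes, purpose: bytes) -> bytes:
    """Derive a 32-byte value from the password for one specific purpose."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt + purpose, ITERATIONS
    )

def server_accepts(stored: bytes, presented: bytes) -> bool:
    """Server-side login check, compared in constant time."""
    return hmac.compare_digest(stored, presented)

salt = os.urandom(16)
wrap_key = derive("correct horse", salt, b"wrap")     # stays on the client; would key the AES step
login_hash = derive("correct horse", salt, b"login")  # the only password-derived value sent to the server

print(server_accepts(login_hash, derive("correct horse", salt, b"login")))
print(wrap_key != login_hash)
```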

The challenge was implementation. At the time there were no AES block-cipher JavaScript modules. My solution was to use GWT to compile BouncyCastle's Java library into JavaScript. I had to strip the library of all low-level I/O calls and replace the BigInteger class with a GWT-JSNI-wrapped version of the BigInteger class I found online. That BigInteger class was a side-project of a Stanford graduate student: http://www-cs-students.stanford.edu/~tjw/jsbn/

He also included a JavaScript implementation of RSA. Long story short, within a year there were several cryptography libraries, but I like to think we were ahead of our time.

Quiark 6 hours ago 3 replies      
Browser crypto is really problematic. For one, the client is downloading the crypto JS implementation (almost) every time they are using the app. Server compromised? crypto.js is rendered useless. HTTPS certificate compromised? crypto.js is useless.

And also:

XSS vuln on your website? crypto.js is useless since attacker will just exfiltrate your private key / password through XSS.

The browser is wonderful for UI but mashing together code and markup and then trying to permissively parse and execute it never goes well for security.

gbrown_ 4 hours ago 3 replies      
For someone not really familiar with javascript could someone explain how sensitive information is removed from memory when no longer needed? Does the runtime expose semantics to ensure memory is zeroed?

Also given javascript is garbage collected are there concerns about timing attacks? Or can you prevent the GC from running in critical sections like Go (I assume Java offers this too)?

baby 6 hours ago 1 reply      
More interestingly, and recent, there is Web Crypto: https://developer.mozilla.org/en-US/docs/Web/API/Web_Crypto_...

Although I have no idea about the state of the thing.

fenguin 4 hours ago 0 replies      
We've been using SJCL for 4 years in our browser extension to encrypt incognito mode bookmarks [1] and one of the things we've really appreciated is its backward compatibility -- version updates have been complete drop-in replacements which has made developing on this very pain-free.

[1] https://hushbookmarks.com

blablabla123 7 hours ago 4 replies      
There's a discussion going on of people who claim doing Crypto in the browser is super insecure. (Crockford is one of them) I wonder to what degree this is true or what needs to be done to do at least some basic crypto in the browser like AES for personal user data.
calebm 6 hours ago 0 replies      
On a related note, I made a little browser-based file encryption app: https://hypervault.github.io/ which uses the Triplesec library (https://github.com/keybase/triplesec), which seemed a nice alternative to SJCL.

Since I know someone will bring up the insecurity of browser crypto: this app is entirely self-contained (the HTML file can be saved and run offline).

nidx 4 hours ago 0 replies      
I use in-browser crypto to get "better than plaintext" encryption for my login pages. I am dealing with a server infrastructure that won't let me easily or cheaply add https to client sites. I use a 512-bit RSA key pair regenerated every 5-10 seconds. I know it could be MITM'ed or brute forced in about a day or two. It's not real security, but it is better than nothing. I wanted to stop sniffing for http logins (would have mitigated that cloudflare issue last week).
singularity2001 1 hour ago 0 replies      
Does it contain elliptic NSA curses / curves ?
Sharma 4 hours ago 0 replies      
I am 99% sure this post was submitted to HN after the user read the following comment yesterday:


woranl 6 hours ago 6 replies      
Is there a market for end-to-end encrypted apps in the browser?
libertymcateer 4 hours ago 0 replies      
So, hopefully without shooting my mouth off (this time), here are some lessons I have been learning (the hard way) about SJCL:

* The SJCL library appears to be really great. It is super simple to use and has great documentation. Dan Boneh has a reputation that precedes him. I've been pointed in the direction of his work multiple times, independently, by different sources. [0]

* The SJCL library is only going to be as good as its implementation in the browser. For instance, how is it getting in the browser? Is it stored in an extension, or is it being loaded dynamically? If it is loaded dynamically, you are going to be vulnerable to a whole host of attacks, including cross-site and MITM. That aside, even if it is loaded dynamically and it gets there in one piece, is it loaded into the runtime securely? This last one is the real kicker for me. If you are implementing it as injected code onto a third party page, you are leaving it open to trivial manipulation by third parties. I was savaged by Nikcub for this, and rightly so. For the embarrassing exchange, you can see my history.

* What is the threat model? Browser-based encryption is only ever going to deal with certain threat models - as we have seen in recent days, there is good reason to assume that end-point security on a host of devices may be compromised by state actors. Please note I am not declaring this to be the case based on my authority - but in a world where we are getting leaks off of handsets prior to encryption by Signal and Whatsapp, assuming end-points are secure is a big assumption. Additionally, although the WikiLeaks Vault7 documents only list compromises up to Chrome ~35 and for Android below KitKat [1], I've read that this list is likely dated, and I don't think it is fair to assume that the development of zero days has stopped there. Accordingly, if you are trying to prevent intrusion from state actors, then there is reason to suspect that a browser-based implementation will never get you there.

* Key generation and exchange remains an issue. Lots of people more qualified than I state that javascript RNGs are just not that great, which can significantly reduce entropy on keys. [2] On top of this, I want to talk about entropy more at length, in particular reference to SJCL: SJCL goes to some length to create 'random' (I personally cannot verify this) salts and initialization vectors. However, as far as I can tell, you have two choices: you either store those separately and transmit them separately from the message, which creates a whole host of issues in the transmission and storage of those salts, or you send them with the message. As far as I can tell, if you send them with the message, the extra entropy they introduce is not relevant for warding off brute force attacks or attacks based on trying to compromise the password (e.g. dictionary attacks), but are only useful against crib-based attacks or other cryptanalytic attacks - which, again, as far as I can tell, if you are going up against the sort of entity that has the resources to actually try and crack AES128 or AES256 by attacking the cipher, rather than the key, I suspect you are dealing with some very nasty people and using javascript crypto is not your best bet.

* Importantly, and critically, security is a conclusion, not a feature. Adding SJCL onto a communications protocol is not going to make it secure. In fact, it has been expressed by people better than me that the author of software cannot self-authenticate that it is secure.[3] It needs to be subjected to third party (and, ideally, public) scrutiny. So, in the end, if you are going to be using a library like SJCL, it is important to have the particular implementation tested by disinterested third parties. Though the math and code behind SJCL may be secure, actually getting it into a piece of software that people want to use introduces a gigantic raft of issues.
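On the salt question in the key-generation bullet above, a small sketch may help. It assumes a PBKDF2-style derivation (Python's hashlib here, standing in for whatever the library does; derive_key is a hypothetical helper). A salt sent in the clear does not slow down a guess-by-guess attack on one message, but it does mean the same password yields a different key per message, so work precomputed for one salt does not carry over to another.

```python
import hashlib
import os

def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key from a password and a (public) salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

salt_a = os.urandom(16)
salt_b = os.urandom(16)
key_a = derive_key("hunter2", salt_a)
key_b = derive_key("hunter2", salt_b)

print(key_a != key_b)                          # same password, different salts
print(derive_key("hunter2", salt_a) == key_a)  # a recipient holding the salt re-derives the key
```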

On background, the reason I know this as a (software) lawyer is because I have been working with SJCL on a node based application for quite some time. I do not represent it is secure - if I have in the past, that was in error and an oversight on my part (hat-tip to all the people on HN who have very rightly pointed this out). However, working on it has been extremely instructive and has confirmed what I always suspected to be true - if you want to be able to say something is secure, you need to be working with people who work on security as a primary occupation, not a hobby or a side-interest. It is too enormous, complex and ever-changing a field for anyone to be an 'expert' at it unless it is their primary concern.

As always, interested in any feedback or counterpoints. Especially on the math.

[0] http://crypto.stanford.edu/~dabo/ ; https://www.linkedin.com/in/dan-boneh-8b599020/

[1] https://wikileaks.org/ciav7p1/cms/page_11629096.html

[2] http://stackoverflow.com/questions/17280390/can-local-storag... - hat tip to https://news.ycombinator.com/user?id=bmh_ca , who, based on this thread, I have discovered was the author of that post on stackoverflow, which is instructive.

[3] https://www.schneier.com/blog/archives/2011/04/schneiers_law...

Show HN: A webrtc library for writing multi-user virtual reality in the browser github.com
25 points by haydenlee  2 hours ago   7 comments top 3
redka 21 minutes ago 1 reply      
Looks very neat. I'm curious how it syncs the entities. Does everyone send to everyone? I'd imagine you couldn't have too many clients at once if that was the case.
haydenlee 2 hours ago 1 reply      
Developer here. Appreciate any feedback! The goal is to make it really easy to write multi-user / multi-player / social virtual reality experiences on the web, on top of a VR framework called A-Frame.
mLuby 1 hour ago 1 reply      
Very cool! Would be nice to be able to chat. :)
Thinking About Recursion solipsys.co.uk
26 points by ColinWright  3 hours ago   9 comments top 6
Koshkin 8 minutes ago 0 replies      
To those who might think that recursion is not all that useful in the practice of commercial software development (as opposed to purely academic interest or as a method of solving logical puzzles) I would point out the fact that recursion is the only way to emulate iteration in functional languages and, as such, it finds extensive use in the C++ template language as well as XSLT - the two most used pure functional programming languages today.
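A minimal sketch of recursion standing in for a loop, written in Python purely for illustration (the function name total is hypothetical). The accumulator argument plays the role a mutable loop variable would play in an imperative version.

```python
def total(items, acc=0):
    """Sum a list without a loop: recurse on the tail, carrying the running sum."""
    if not items:
        return acc                      # base case: nothing left to add
    return total(items[1:], acc + items[0])

print(total([1, 2, 3, 4]))
```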
akkartik 1 hour ago 0 replies      
I feel fairly confident that I've solved the problems of teaching functions (http://akkartik.name/post/mu ; particularly the last couple of sections) and recursion (http://akkartik.name/post/swamp). Tl;dr - It just requires first introducing these ideas in a statement-oriented language without recursive expressions. Allow learners to see the steps unfold explicitly, one by one.

(Comment deliberately phrased provocatively to hear push-back and feedback.)

jonsen 2 hours ago 0 replies      
"Let me see if I got this recursive disk juggling right. You move the top-most disk to position C. Then recursion can move the rest to position B. And finally I move the disk at position C on top of the rest at position B."

"No, no, no. Look ..."

"But to start you can only move the top-most disk?"

"Yes, but it won't work because ..."

"What? You have to move the top-most as the first move."

jsperson 1 hour ago 1 reply      
I just spent an hour reading this blog. I now feel absolutely smarter and relatively dumber. What a wonderful blog. Thanks for sharing!
Chatbot that overturned 160k parking fines now helping refugees claim asylum theguardian.com
179 points by walterbell  4 hours ago   65 comments top 8
roywiggins 1 hour ago 1 reply      
I bet you could do a lot of good with an equivalent for eviction proceedings. Most people being evicted don't have representation.
omash 2 hours ago 5 replies      
Can someone explain how the first chatbot was any better than an interactive form?
jlev 3 hours ago 2 replies      
Great to see more chatbots that encourage real world civic action, not just make it easier to order pizza.

I did similar work with Hello.Vote for voter registration, and there are so many offline forms that could be improved with a bit of web/messaging UI. Also see snapfresh.org for a CfA project that helps find places that accept EBT cards.

fnbr 3 hours ago 1 reply      
What worries me about this is the liability- if you mess up someone's asylum claim, they can die, or suffer extreme harm (e.g. if they're tortured). I'd be interested in seeing how DoNotPay manages their liability.
dao- 3 hours ago 2 replies      
> The 20-year-old chose Facebook Messenger as a home for the latest incarnation of his robot lawyer because of accessibility. It works with almost every device, making it accessible to over a billion people, he said.

That can't be the whole story... A web app could have been at least as accessible. I guess Facebook offers APIs that make developing this kind of application easier than looking for other frameworks or starting from scratch.

elandybarr 3 hours ago 5 replies      
So what happens when all 10.5 million Somalians, 22.85 million Syrians, 24.41 million Yemenis, 33.42 million Iraqis, 37.96 million Sudanese, 6.35 million Eritreans, 77.45 million Iranians, etc apply?
willyyr 3 hours ago 1 reply      
Does anyone know what technology he is using in the background? Is it Facebook's own bot framework?
emodendroket 4 hours ago 1 reply      
I think one of these processes is a lot easier than the other.
Patagonia and The North Face theguardian.com
13 points by waqasaday  2 hours ago   5 comments top 5
pizzetta 10 minutes ago 0 replies      
Their "rivalry" reminds me of the Canon/Nikon co-existence model, where they competed but in slightly different market segments. One would have lenses for x-type photography, while the other had them for y-type photography. Or one would have a great sensor in some dimension (color vs bleed) etc... And they both co-exist without annihilating each other, although Nikon seems to have made some poor decisions of late (and nixed one of their most anticipated products due to costs and ongoing product issues).
sizzzzlerz 39 minutes ago 0 replies      
I still own a North Face down jacket I received as a birthday gift 45 years ago. It doesn't fit me any more and could use a good cleaning but it is still wearable and useful for keeping warm. I also had a tent I used regularly for more than 20 years under all kinds of weather until the floor finally gave out. They made really quality stuff back in the day, expensive, yeah, but it lasted.
burntrelish1273 33 minutes ago 0 replies      
REI, Land Rover and most high-end mountain bikes arguably sit in the same microfiber travel chair that folds up into a credit-card: ostensible backcountry use luxury city gear.

Some motivations:

0. In affluent/flat societies, status is everything, even with home-improvement, camping and other common items. Showing overpaying for shit that doesn't matter is SharperImage's whole business model.

1. Some people want non-crappy, daily-practical, multi-use items: sure an over-priced rain jacket shell is good for running but it also fits into a glovebox. Or a sleeping pad that collapses down and fits in the closet behind all the other camping gear used at home.

Btw I bought a $250 Columbia Titanium GoreTex jacket with an actual lifetime warranty some years ago that has deteriorated, and not from normal wear and tear. Obviously, it cost them probably $15-20 landed to make it so it makes sense to honor such warranties for a time.

gdubs 50 minutes ago 0 replies      
There's a great podcast series from NPR called How I Built This. They recently interviewed Patagonia's founder Yvon Chouinard:


I found it inspiring. If you're unfamiliar with the podcast, it's focused on makers and entrepreneurs.

hprotagonist 47 minutes ago 0 replies      
Yvon Chouinard is a real fun guy. (and a stonemaster of old, so he's gotta be OK..)

Politics aside, I am thrilled with Patagonia's repair policy. They've repaired a pair of jeans for me 7 times now, several hoodies a few times each ... the up front cost for goods is high, but the durability and service life more than makes up for it.

ANTLR Mega Tutorial tomassetti.me
201 points by ftomassetti  10 hours ago   55 comments top 12
musesum 8 minutes ago 0 replies      
I used Antlr v3 to create a NLP parser for calendar events for iOS and Android. It took longer than expected. iOS + C runtime was opaque, so had to write a tool for debugging. Android + Java runtime overran memory, so had to break into separate grammars. Of course, NLP is not a natural fit. Don't know what problems are fixed by v4.

> The most obvious is the lack of recursion: you can't find a (regular) expression inside another one ...

PCRE has some recursion. Here is an example for parsing anything between { }, with counting of inner brackets:
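The pattern itself did not survive in this copy. As a separate sketch of the same task (not the commenter's regex), here is a hand-written depth counter in Python; the standard re module has no recursion construct like PCRE's (?R), so the nesting is tracked explicitly. The function name balanced_spans is hypothetical.

```python
def balanced_spans(text):
    """Return the top-level {...} spans in text, counting nested braces."""
    spans, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i           # remember where the outermost brace opened
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:          # outermost brace just closed
                spans.append(text[start:i + 1])
    return spans

print(balanced_spans("a {b {c} d} e {f}"))
```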


A C++11 constexpr can make hand coded parsers a lot more readable, allowing token names in case statements. For example, search on "str2int" in the following island parser: https://github.com/musesum/par

jasode 8 hours ago 8 replies      
I last played around with ANTLR in 2012 (when it was version 3) and I discovered that there's a "bigger picture" to the parser generator universe that most tutorials don't talk about:

1) ANTLR is a good tool for generating "happy path" parsers. With a grammar specification, it easily generates a parser that accepts or rejects a piece of source code. However, it's not easy to use the hooks to generate high quality diagnostic error messages.

2) ANTLR was not good for speculative parsing or probabilistic parsing which would be the basis of today's generation of tools such as "Intellisense" not giving up on parsing when there's an unclosed brace or missing variable declaration.

The common theme to the 2 bullet points above is that a high quality compiler written by hand will hold multiple "states" of information and an ANTLR grammar file doesn't really have an obvious way to express that knowledge. A pathological example would be the numerous "broken HTML" pages being successfully parsed by browsers. It would be very hard to replicate how Chrome/Firefox/Safari/IE doesn't choke on broken HTML by using ANTLR to generate an HTML parser.

In short, ANTLR is great for prototyping a parser but any industrial-grade parser released into the wild with programmers' expectations of helpful error messages would require a hand-written parser.

Lastly, the lexing (creating the tokens) and parsing (creating the AST) is a very tiny percentage of the total development of a quality compiler. Therefore, ANTLR doesn't save as much time as one might think.

I welcome any comments about v4 that makes those findings obsolete.

CalChris 9 hours ago 2 replies      
I switched over to ANTLR 4. It is strictly superior to ANTLR 3. The listener approach rather than embedding code in the grammar is very natural. Separating leads to clean grammars and clean action code. Odd thing is that I was stuck on 3 because 4 didn't support C yet and then I just switched to the Java target in an anti-C pique. Shoulda done that awhile ago.

TParr's The Definitive ANTLR 4 Reference is quite good. And so's this mega tutorial.


ANTLR is my goto tool for DSLs.

raverbashing 9 hours ago 4 replies      
Just a note on "Why not to use regular expression": depending on the complexity of the language, it can simply be impossible.

REs are Type-3 (regular), the most restricted level of the Chomsky hierarchy: https://en.wikipedia.org/wiki/Chomsky_hierarchy

ttd 6 hours ago 1 reply      
I think everyone should manually implement a simple recursive descent parser at least once in their careers. It's surprisingly easy, and really (in my experience) helps to break through the mental barrier of parsers being magical black boxes.

Plus, once you have an understanding of recursive descent parsing, it's a relatively small leap to recursive descent code generation. And once you're there, you have a pretty good high-level understanding of the entire compilation pipeline (minus optimization).

Then all of a sudden, compilers are a whole lot less impenetrable.
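For anyone who wants to try it, here is one possible minimal sketch: a recursive descent evaluator for integer arithmetic, with one method per grammar rule. The grammar and all names (tokenize, Parser, evaluate) are made up for illustration.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(src):
    """Split the source into number and operator tokens."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):      # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):      # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):    # factor := NUMBER | '(' expr ')'
        tok = self.take()
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(src):
    parser = Parser(tokenize(src))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value

print(evaluate("2 + 3 * (4 - 1)"))
```

Swapping the arithmetic in expr, term and factor for code that emits instructions is roughly the step to recursive descent code generation.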

nradov 3 hours ago 1 reply      
I used ANTLR to write a fuzz testing tool which parses an ABNF grammar (like in an IETF RFC) and then generates random output which matches the grammar. Worked great!
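The generation half of that idea can be sketched in a few lines. This is not the tool described above: the grammar here is a hardcoded toy dictionary rather than parsed ABNF, and GRAMMAR and generate are hypothetical names.

```python
import random

# Hypothetical toy grammar; keys are nonterminals, anything else is a terminal.
GRAMMAR = {
    "EXPR": [["TERM"], ["TERM", "+", "EXPR"]],
    "TERM": [["DIGIT"], ["(", "EXPR", ")"]],
    "DIGIT": [[d] for d in "0123456789"],
}

def generate(symbol, rng, depth=0, max_depth=6):
    """Expand symbol into a random string that matches the grammar."""
    if symbol not in GRAMMAR:
        return symbol                       # terminal: emit as-is
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        options = [options[0]]              # first alternative is non-recursive; forces termination
    production = rng.choice(options)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in production)

rng = random.Random(0)                      # seeded so runs are repeatable
for _ in range(3):
    print(generate("EXPR", rng))
```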


pjmlp 9 hours ago 1 reply      
Great tutorial, ANTLR is one of the best tools for prototyping languages and compilers.

I wasn't aware it supports JavaScript nowadays.

In any case, good selection of languages.

closed 3 hours ago 0 replies      
This tutorial looks great. I picked up Antlr4 a few months ago, and hadn't done any parsing before then. The first week was basically me, The Definitive Antlr4 Reference, and extreme confusion with how different targets worked. Compounding the problem was the fact that a lot of the antlr4 example grammars only work for a specific target. The use of different language implementations as part of this tutorial seems really useful!

(Antlr4 is awesome :)

betenoire 3 hours ago 0 replies      
These types of tutorials always start out explaining the problems with Regular Expressions and why not to use them... then immediately proceed to lexing via regular expressions.

Perhaps the tutorials should start with the strengths of regular expressions, and how we can harness that for getting started with a lexer.
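A minimal sketch of what that starting point could look like: a lexer built from one combined regular expression with named groups. The token names and patterns are invented for illustration.

```python
import re

# Order matters: earlier alternatives win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def lex(src):
    """Turn source text into a list of (kind, text) tokens."""
    tokens = []
    for m in MASTER.finditer(src):
        kind = m.lastgroup
        if kind == "SKIP":
            continue                        # drop whitespace
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {m.group()!r} at {m.start()}")
        tokens.append((kind, m.group()))
    return tokens

print(lex("x = 42 + y1"))
```

The regular expressions only classify flat runs of characters; anything nested is left to the parser on top.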

poppingtonic 5 hours ago 0 replies      
I used ANTLR to write a Python parser for the SNOMED expression language last year, and testing it was one of the weirder parts of the experience. I was up and running in a few days, which was largely thanks to the ANTLR book. I love this project. It made doing what I did a lot more fun than I thought it would be. Hand-rolling an ABNF parser from scratch would be a nice hobby project, but not when one has a deadline.
intrasight 7 hours ago 2 replies      
For .Net projects, I've used Irony. From the CodePlex site:

"Unlike most existing yacc/lex-style solutions Irony does not employ any scanner or parser code generation from grammar specifications written in a specialized meta-language. In Irony the target language grammar is coded directly in c# using operator overloading to express grammar constructs. "

destructaball 7 hours ago 2 replies      
What are the advantages of ANTLR over something like Haskell's Parsec?


Show HN: Gitly.io high performance Git service with a 10s installation time gitly.io
85 points by alex-m  2 hours ago   49 comments top 15
Analemma_ 41 minutes ago 1 reply      
It's so cool to see people focusing on web app performance. Everyone complains about it but no one follows up their complaints with action. That's a big plus and I will definitely keep my eye on this.

This is going to sound petty, but I think it's important: you might want to come up with a better name. "Gitly.io" is, like, three different regrettable startup naming fads in one.

drizze 5 minutes ago 1 reply      
Why is it that the C++ code is pink in one screenshot, but blue in another? And Python is blue in one screenshot and pink in another?

Looks like a great project! Thanks for sharing!

aphextron 25 minutes ago 2 replies      
This looks promising. Can't wait to check out the source. How exactly do you plan on monetizing while allowing unlimited free private repos?

Also, for anyone looking for a good self-hosted git service like this, check out the existing open source project GOGS https://gogs.io/

ErrantX 1 hour ago 1 reply      
I like the sub-domain approach. It's fast and nice enough. I'd focus on UI consistency; pages have different navigation elements. Issues trello-style board is cool but let down by over-simple styling.

FYI the default avatar image (https://errant.gitly.io/img/avatars/default.png) seems to be missing? On the issues page, JQuery UI seems to be missing too.

If you keep working on it I'll probably keep using it :)

skykooler 1 hour ago 1 reply      
> Self-hosting will be available on April 1, 2017.

Really, or is that just an April Fool's joke?

movedx 46 minutes ago 1 reply      
Just so you know, your email for verifying an account signup comes through with a pretty high spam score (for me at least.) The first email was rated 4.5 and the second 5.9 (I signed up two accounts: personal and business.)

I use Fastmail as my provider.

mtrn 42 minutes ago 1 reply      
Very nice. It really took me just 10 seconds to sign up and clone a repo. I believe this kind of ease of setup should be much more common - across many domains and languages.

That said, details will matter in the longer run. The markdown and code rendering on github has gotten such a large amount of attention over the years, that it's hard to compete.

dlehman 13 minutes ago 1 reply      
Can you clone a bitbucket repo, or only GitHub?
alex-m 2 hours ago 6 replies      

Gitly is my side project I've been working on for a couple of months. It is an open source repository manager with a focus on performance, ease of use, and productivity (especially for larger projects).

It's still in an early alpha stage, a lot of features are missing. The source code and the ability to self-host will be available within a month.

There's GitHub, Bitbucket, GitLab, gogs. Why create another solution? Gitly has been designed to be very fast and ridiculously easy to maintain. It is much faster than all of the above. It also offers a couple of unique features.

You can self-host gitly in 10 seconds. It has no dependencies, and doesn't require a database or a web server. Updates are automatic and seamless. It's easier to set up than gitweb! At the same time gitly is going to have the same features GitHub/GitLab offer, and even more.

How fast is gitly? Every single page takes less than 0.5s to load, no matter how big the project is. There are no JS libraries used (in fact, there's barely any JS at all), so the client side performance is great.

You can host your repositories on gitly.io, and it offers the same high performance. Cloning the entire Spring framework on gitly.io takes 11 seconds. On GitLab.com it took 7 minutes and 50 seconds. Of course GitLab.com is massive. However, the way gitly was built, performance will always be this good: it can be scaled horizontally very easily. And if you host it locally, the cheapest 256 MB instance should be enough for most users.

Gitly works great with large projects. I successfully tested it with a 10 year old repository with 4 million lines of code. Bitbucket took almost an hour to cache. Gogs crashed, and I didn't manage to install GitLab to test it locally after trying for 2 hours.

Many of the unique features are to improve productivity and help understand the code base better, which is very useful for large projects.

One of them is called "top files". It shows the largest files in any directory of the repository on one page with detailed language stats. Here's what it looks like for the Spring framework:


Another unique feature is language stats. Gitly displays detailed language stats for every single directory. This can give a better picture of the structure of the project.

One of my favorite features is the search. Unlike all other search engines, the one in gitly will search exactly what you asked for: character by character.

For example, if you are a Rust developer, and you need to search for the following declaration:

fn next(&mut self) -> Option<Self::Item> {

You type it, and you only get the results you are interested in:
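As an illustration of what "character by character" search means (a sketch only, not gitly's actual implementation), here is a literal substring search in Python. The `literal_search` function and the sample file contents are made up for the example.

```python
def literal_search(files, query):
    """Return (path, line_number, line) for every line containing query verbatim."""
    hits = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            # Plain substring test: no tokenizing, no stemming, no dropped punctuation.
            if query in line:
                hits.append((path, number, line.strip()))
    return hits


# Hypothetical repository contents, keyed by file path.
files = {
    "iter.rs": "impl Iterator for Counter {\n    fn next(&mut self) -> Option<Self::Item> {\n",
    "other.rs": "fn next(&self) -> Option<u32> {\n",
}

hits = literal_search(files, "fn next(&mut self) -> Option<Self::Item> {")
```

Only the line in `iter.rs` is returned; the similar-looking declaration in `other.rs` does not match because the query is compared verbatim.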


Gitly also has discussions, which is like a simple forum where you can discuss the project, ask questions and so on. Mailing lists will be integrated as well.

Like most other solutions, gitly has a Trello-like issues board, and you can import all your Trello boards.

It's still an alpha. There are a couple of rough edges (e.g. markdown support is not great). Here are the missing critical features that will be implemented within the next two weeks:

- Forking, pull requests, code review.
- User profile (password change, SSH keys etc)
- SSH support (only HTTPS for now, making SSH authorization secure takes time)

Some of the upcoming functionality I'm excited about:

- Go to definition support for most popular languages
- Issues as part of the repository, so that it's possible to manage them locally
- Pull request interface that would make Linus happy
- Search in commit history
- A way to organize a large amount of repositories

Thanks for your time, looking forward to your feedback.

romuloab42 57 minutes ago 1 reply      
It is so refreshing to see slim, fast websites. Only the minimal JS is used. Even though I don't have many repositories, I'll certainly consider supporting gitly.

Please remain true to your current vision.

Just a minor point: although the website uses so little JS, the Signup form is broken without it.

EDIT: by broken I mean it returns a JSON instead of an HTML page or HTTP redirect.

koolba 1 hour ago 1 reply      
Always nice to see competition in this space. Keeps people on their toes, though it's a steep hill to climb feature wise to bring out something as polished as the alternatives.

> Self-host in 10 seconds

> You can run gitly on your own server. It requires no installation or maintenance. You don't even need a webserver or a database. (Self-hosting will be available on April 1, 2017.)

Maybe April fools day isn't the best choice for a future release date? Though I guess it worked for Gmail...

baal80spam 48 minutes ago 2 replies      
Looks great. May I ask, how can you provide free self-hosting (even if it's free only for <5 users)?

edit: I didn't receive email confirmation yet, after signing up (15 minutes ago). HN server effect?

floatboth 39 minutes ago 0 replies      
Wow, the design is so "inspired" by GitHub, the language percentage thing especially
hdhzy 50 minutes ago 1 reply      
Very cool. I've been thinking about something like that for a long time.

I also don't mind paid solution if it'll be open source.

justinsaccount 1 hour ago 1 reply      
What's it written in?
The Origins of the Linguistic Conception of Programming (2014) [pdf] uva.nl
65 points by mpweiher  7 hours ago   20 comments top 3
anigbrowl 5 hours ago 4 replies      
Extremely interesting. I really dislike writing code because you can't get any sense of the structure by looking at it - it's like reading a description of a painting in book form. It's fascinating to see the historical context that led to that becoming a dominant paradigm.
joe_the_user 2 hours ago 0 replies      
This looks like a marvelous inquiry with a lot of "big picture" implications.

"All of these approaches, however, take the linguistic nature of computer programming for granted. Indeed, it is surprising how rarely language appears in the list of relevant programming metaphors, despite periodic attempts to envisage program code as a form of literary expression. It is as if we have become so accustomed to think of programming languages as languages that we forget that this analogy has its own history."

And let's not forget that the idea of "a language" also has its history, with Chomsky shaping the concepts of linguistics using various mathematical tools during the period that the idea of "a computer language" arose. A given computer language can more easily be located on the Chomsky hierarchy than a human-spoken-language, just for example.

pron 2 hours ago 0 replies      
In 1947, Alan Turing said the following in a lecture to the London Mathematical Society[1]

> I expect that digital computing machines will eventually stimulate a considerable interest in symbolic logic and mathematical philosophy. The language in which one communicates with these machines, i.e. the language of instruction tables, forms a sort of symbolic logic. The machine interprets whatever it is told in a quite definite manner without any sense of humour or sense of proportion. Unless in communicating with it one says exactly what one means, trouble is bound to result. Actually one could communicate with these machines in any language provided it was an exact language, i.e. in principle one should be able to communicate in any symbolic logic, provided that the machine were given instruction tables which would enable it to interpret that logical system. This would mean that there will be much more practical scope for logical systems than there has been in the past. Some attempts will probably be made to get the machine to do actual manipulations of mathematical formulae. To do so will require the development of a special logical system for the purpose. This system should resemble normal mathematical procedure closely, but at the same time should be as unambiguous as possible. As regards mathematical philosophy, since the machines will be doing more and more mathematics themselves, the centre of gravity of the human interest will be driven further and further into philosophical questions of what can in principle be done etc.

He added (not related to the topic of language per se, but still amusing):

> It will be seen that the possibilities as to what one may do are immense. One of our difficulties will be the maintenance of an appropriate discipline, so that we do not lose track of what we are doing. We shall need a number of efficient librarian types to keep us in order.

[1]: http://www.vordenker.de/downloads/turing-vorlesung.pdf

From Vilified to Vindicated: The Story of Jacques Cinq-Mars hakaimagazine.com
16 points by hownottowrite  3 hours ago   1 comment top
dmix 34 minutes ago 0 replies      
While I'm sympathetic to Cinq-Mars and the amount of time it took to become vindicated, I don't find much fault in the original scientists who demanded more evidence than was available.

It seems he only had the one bone with indication of human butchery. It wasn't until decades later that another scientist was able to review hundreds of bones and found 15 other examples.

This must be considered in context. Someone claiming that humans were in North America ten thousand years earlier must automatically bring skepticism. Were there many charlatans at the time, or in history, making similar claims that were proven false?

The only question was why it took so long for enough resources to be invested in fully researching the cave? How much was it the fault of lack of funding?

VMware joins Linux Foundation: what about the GPL? gnumonks.org
196 points by JoshTriplett  4 hours ago   67 comments top 9
bcantrill 3 hours ago 3 replies      
I understand the consternation -- I have very mixed feelings about the Linux Foundation (I am on the TOC of the CNCF, an LF project), but the reality is that as a 501(c)(6) they are much better resourced than the 501(c)(3)s dedicated to open source. In my opinion, the LF therefore represents our (current) best shot to achieve some common good, and (speaking personally), I have resolved to work from within the LF to make it more useful to the broader constituency of open source communities. (As a concrete example, I championed the LF/CNCF acquiring and relicensing RethinkDB[1].)

Similarly, I think applying a purity test to the LF would be counterproductive: it would end up refusing essentially everyone's money and becoming the FSF or Apache Foundation -- two flat-broke 501(c)(3)s. (Aside, individuals should never ever donate to the LF; donate to a 501(c)(3) instead.)

To sum: I would rather have VMware in than out, and then influence the LF to abide by its broader open source constituency, funding projects at the grass roots that are the lifeblood of open source -- but I also understand those that would view taking VMware's money as selling out that constituency!

[1] https://www.joyent.com/blog/the-liberation-of-rethinkdb

ChuckMcM 3 hours ago 3 replies      
While I respect the angst the author feels, this snippet ... "... allowing an entity like VMware to join, despite their many years long disrespect for the most basic principles of the FOSS Community (such as: Following the GPL and its copyleft principle), really is hard to understand and accept."

I think this presumes a different rationale on the part of the Linux foundation. The answer to the question "Here is some money, can I join?" is always yes. And with enough money it is "heck yes!"

And that leads to "I guess I was being overly naive :("

Which is yes, you were.

Why do politicians, managers, and ex-lovers all have to just "put their differences aside and move on." ? Because if you don't you can't make any progress even if the penultimate step here was one that hurt you badly.

The best way to score this is that VMWare is going to participate in a forum with others who can help guide them to a better compliance record. Both to understand why compliance is good, and to alleviate fears that VMWare might have that complying would leave them vulnerable. What they were before is in the past, now you have a way to work with them to move forward. Consider the alternative that they don't join, they continue to be huge FOSS scofflaws and continue to provide some sort of 'you don't really have to comply' example to others who might be uncomfortable with the GPL. When you look at it this way, it really is a good thing.

andrewshadura 3 hours ago 0 replies      
Well, it's sort of well-known that key Linux Foundation figures are openly against GPL enforcement, even in its most harmless forms such as those SFC does.
arca_vorago 3 hours ago 2 replies      
GPLv3 is the fix, but Torvalds refuses to apply it to linux... root of the problem imho. When I can have a gplv2 linux on my samsung TV that gets rooted by the CIA but I can't root it without bricking it, you know it's time to switch to gplv3.
guelo 2 hours ago 1 reply      
The GPL is just incidental to Linux at this point. Linux is a collaboration between dozens of giant corporations many of which actually hate the GPL. The success of the Linux Foundation is in convincing all those companies that the viral nature of the GPL has been neutralized and it is safe to use it in their proprietary products. There is no idealism remaining but Linux is still an amazing achievement that benefits us all.
chappi42 2 hours ago 1 reply      
I almost didn't read to the end (got annoyed). Isn't it wonderful that 'Linux' is able to be friends with companies and welcome them?

So many slanderous remarks, 'you can abuse Linux', 'the Linux Foundations has no ethical concerns', 'work with the community, and not against it'. And shouldn't there have been mentioned that VMware afaiu won a (first?) court case?

The questions are important of course. Here is an imo much more fruitful read, likely well-known: https://lists.linuxfoundation.org/pipermail/ksummit-discuss/...

throw2016 3 hours ago 2 replies      
A lot of the Linux bodies with community sounding names like Linux foundation are actually groups of corporate interests. I had looked up some of the people behind these foundations a couple of years ago and they came across as careerists with little connection to Linux, open source or Linux advocacy.

There is nothing wrong with this, but the names of these bodies should reflect their interests. A name like 'Linux Foundation' should be the flagship organization promoting Linux and open source globally as well as user and developer interests. Some thing like 'Linux Industry Group' is a better reflection of the work Linux foundation does.

Ultimately it's high time end-users organize some kind of funded entity to support open source projects and protect user interests. Not just one but multiple such bodies. Without that the interests of those funding projects, developers and industry bodies will triumph and you will just have to accept whatever is 'decided' for you.

I think forums like this should also gently encourage all the thousands of startups who dip into the Linux ecosystem pool but do not contribute anything back even after they are successful, sometimes wildly. Github and Redis come to mind. That doesn't feel right.

Hiring people working on specific open source projects is good, but it's not support, it's protecting your own interests. Non-conditional funding so they can continue to do their work that you have benefited from is better.

It's like a river, if you keep on taking from it eventually there will be nothing left.

mtgx 3 hours ago 2 replies      
> It sends the following messages:

- you can abuse Linux, the GPL and copyleft while still being accepted amidst the Linux Foundation Members

- it means the Linux Foundations has no ethical concerns whatsoever about accepting such entities without previously asking them to become clean

- it also means that VMware has still not understood that Linux and FOSS is about your actions, particularly the kind of choices you make how to technically work with the community, and not against it.

I think all of those apply to Microsoft, too. The Linux Foundation tainted its image when it accepted Microsoft without even demanding that it "becomes clean," as the author says, in regards to its exploitative use of patents against OEMs that dare to use Linux-based operating systems.

bykovich 4 hours ago 10 replies      
I gotta say that I think this is just not a big deal for most people. I understand that FOSS ideals have appeal, I understand why people would like them to be protected -- but at the end of the day, I do feel like FOSS advocates, as well-meaning as they are, are perhaps investing their concern in the wrong place. FOSS rhetoric is nice, but it just doesn't seem that important.
Zebras Fix What Unicorns Break medium.com
67 points by uptown  2 hours ago   12 comments top 7
thehardsphere 1 hour ago 0 replies      
> 2. Zebra companies are often started by women and other underrepresented founders.

Is this really the number 2 reason? Like, is the suggestion here if we got white men to start Zebras then Zebras would be more popular?

The other obnoxious thing is that they don't actually provide evidence for this; they cite that very few women get VC money and then claim that it's because they're running Zebras. And it's not relevant to say they start "30 percent of businesses" because I doubt that's 30 percent of software businesses.

endymi0n 1 hour ago 1 reply      
Not convinced. By a lot of their definitions we're a Zebra (profitable, no VC, using 1% of our gross revenue to make a positive impact, not aiming for a 1 bn valuation) - but for my taste, their manifesto is way too much feel-good, self-limitation and mediocrity. It seems like a knee-jerk reaction to overhyped unicorns, but not much more. Why does it have to be exactly one way or the other? Although we're bootstrapped, it doesn't mean we're a "lifestyle business", we can still (and are currently) disrupt/ing an entire industry. If you look at Google or SpaceX as an end result, you might realize that ambitions and being a great employer that makes a positive impact on the world aren't mutually exclusive.
themgt 1 hour ago 1 reply      
Gotta say it's a little ironic their Zebras Unite DazzleCon[1] is itself being held in Unicorn-central San Francisco. Why not like ... Iowa? Not exactly helping the accessibility for real businesses nationwide by hosting this in SF.

[1]: https://www.zebrasunite.com/dazzlecon/

noobiemcfoob 1 hour ago 0 replies      
I've been feeling a lot of this myself. This corporate balance can't last. People will strive to merge their outlooks and philosophies with their day to day actions, so companies that don't help satisfy that thirst for something more out of life in their employees will have trouble. Plenty of people just want a punchcard, but I doubt it crests 30% in a secure population.

Just what that alternate model might be... I'll keep searching. In the meantime, co-ops are promising. There aren't many tech co-ops; most seem to be farms. But maybe.

"Companies We Keep" - an introduction to co: https://www.amazon.com/gp/product/B005KTT65Q/ref=oh_aui_sear...

mperham 56 minutes ago 0 replies      
Sounds like they are talking about B Corps.


r00fus 1 hour ago 1 reply      
I'm all for it. But how do Zebras compete against Unicorns? Especially when Unicorns have VC magical funding supporting them...
rdiddly 1 hour ago 0 replies      
Their values are in the right place, though I think a glance at the "bigger picture" is warranted. (Ugh, not that again! Yeah sorry.)

Literally most of the American economy, most of the GDP, used to be made up of "zebras," maybe still is. People sensibly and boringly making dependably useful goods and performing useful services at reasonably transparent prices.

What's great about that from an investor's point of view is that you can buy in and expect a reasonably solid rate of growth, especially when you optimize for the long term and ignore the short-term fever-dreams and diaper-crappings of the market.

Nowadays though, that part of the economy (the real part, the "zebra" part) has been faltering for several reasons I won't get into. That means you can't count on those steady or even decent returns in traditional places anymore. Which in turn means there's a lot more money out there circling the block, looking for a place to park where it won't lose value.

That is how the Silicon Valley venture capital ecosystem exists at its current size. When rich people can put their money in traditional places like refrigerator manufacturers, and confidently earn a decent return, they don't (not as much) go looking so hard for something better. They might take a small bit of their portfolio and play around with it, but there won't be a huge "bubble" like you've got now.

SV is a market all right, only the good being "demanded" (by VCs) and "supplied" (by startups) is "growth potential." High demand in that market is possible only because the rest of the economy, the zebra economy, is so lackluster. And because everyone plausibly believes computers can extract value from a system in a disproportionate way that might pay off --- which is true by the way!

Anyway, all that easy money looking for a place to park, means at least some of the deals are fake and some of the startups are fake, and most of the valuations are fake, and all the fakery leads to all sorts of rampant doucherie to use the French term.

Nevertheless, She Coded dev.to
83 points by lmcnish14  5 hours ago   78 comments top 13
bobbington 1 minute ago 1 reply      
I'm going to drop a bombshell here...

Maybe women just don't like coding as much as men and that's why they are "underrepresented".

I'm going to drop another bombshell...

Maybe it's OK that women focus on other things they prefer because specialization of labor is a good thing.

Crazy thoughts, I know.

I'm going to drop the biggest bombshell right now...

Men have a Y chromosome, and women don't. The effects on a person are huge from physically different body parts to powerfully different hormones. Maybe our rabid push for equality has failed to account for differences that truly exist between male and female... Maybe if we took that into account we would seek to find the best job for each individual, not try to make women act like men and disparage those who don't.

Maybe if we acknowledged the benefits of gender differences we would never have come to the mentally ill conclusion that two males make a great marriage couple.

Studies show that women have more money and power today and yet are less happy.

Let's let women be women. Then instead of expecting them to protect themselves because they are no different than men, we would treat them with special honor.

zephyrthenoble 5 hours ago 1 reply      
I'm frankly quite surprised by the initial wave of comments disparaging this message.

The facts are that women are poorly represented in the tech community[1], and do make less than men[2]. Any attempt to let women feel more accepted and bring about much needed change should be championed, not picked apart and belittled because you feel like you are personally being attacked when people are just asking for help.

1. http://www.usatoday.com/story/tech/2014/05/28/google-release...

2. https://www.cnet.com/news/biggest-pay-gap-in-america-compute...

chrisbrandow 2 hours ago 0 replies      
I just had a conversation with a friend of mine today about the issues she is facing as a "woman in tech". It was some stuff that really surprised me. On today of all days.

Treatment of women as full-fledged peers is definitely something that needs better attention in our industry.

curyous 5 hours ago 2 replies      
The whole premise of this article as espoused in the first sentence, seems rather divorced from reality, to me.
Pigo 5 hours ago 1 reply      
"continued pay disparity"

here we go...

cholantesh 4 hours ago 0 replies      
Why was this flagged?
dijit 5 hours ago 5 replies      
Virtue Signalling and attempting to divide the genders in the opening, before the quotes from women in tech.

I expect better from The Practical Dev, you don't have to paint men in a derogatory light to bring up women.

dethswatch 5 hours ago 2 replies      
Warren violated the rules- it has _nothing_ to do with her gender.

Whether her points are valid or not, the rules were followed as they would have been with any man.

If she wants to get her opinion out, she can hold a press conference outside the Senate or simply call Maddow, like she did- a friendly audience that would allow her to speak as long as they had time for.

elastic_church 5 hours ago 1 reply      
Don't recruiters keep stats on the offers and demographics that people get?

I had heard wage gap wasn't an accurate disparity in software engineering jobs. Like there isn't a cabal of people at every company conspiring against equally qualified candidates based on gender.

In HIRED's report it seemed more common that people underbid themselves, and more often than not the company still gave people higher offers if they had underbid but these were still lower offers than for people that bid higher or overbid.

Let's work on it but we have to get the discussion right first. I think vilifying a sexist boogeyman isn't going to get us anywhere if a persistent reality is more nuanced.

Zikes 5 hours ago 1 reply      
> Gender inequality has permeated the technology and computer science fields since their earliest beginnings.

It's true. The earliest "computers" were overwhelmingly women.


pkd 2 hours ago 3 replies      
I personally don't find anything flag worthy in this. Can the mods please unflag this?
tzs 9 minutes ago 0 replies      
> Senator Elizabeth Warren (D-MA) had arrived on the Senate floor on Tuesday, Feb. 8, 2017 to debate the confirmation of attorney general nominee Jeff Sessions.

Feb 8, 2017 was a Wednesday, not a Tuesday.
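For anyone who wants to verify the weekday, a quick sketch using Python's standard `datetime` module; the `WEEKDAYS` list is just a helper defined here so the result does not depend on the system locale.

```python
import datetime

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

day_name = WEEKDAYS[datetime.date(2017, 2, 8).weekday()]
print(day_name)
```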

dijit 5 hours ago 2 replies      
I'm desperate to push the "women aren't victims" narrative.

If we tell them they're victims enough then they'll believe it. There is evidence of this in other things for instance refugees who are told that they're victims are less likely to integrate.

I'm more focused on pointing out that we're all equal. If you're a woman on my team, I'm incredibly sorry but I'm not going to celebrate your femininity any more than I'd celebrate my other colleagues' manliness. You do your job and I'll reward everyone with good pay, a bonus and a cake or two.

Ask HN: What does your production machine learning pipeline look like?
228 points by bayonetz  6 hours ago   54 comments top 13
bradneuberg 2 hours ago 6 replies      
(ML engineer from Dropbox here)

For deep learning oriented projects, we train on EC2 GPU instances, generally p2.xlarge instances these days to get the Nvidia K80 GPUs. We can spin up many of these in parallel if we are doing model architecture searches or hyperparameter exploration.

We have an in-house data turking setup where we can efficiently roll new UIs to get ground-truth data for given problems, generally in the thousands to tens of thousands of real data points. We also use data augmentation where possible, synthetically generating millions of example data points, combining it with the real turked data for fine tuning. Note that we never look at or train with real user data, unless explicitly donated, so data efficiency is important to us.

We've standardized on TensorFlow these days, doing inference on CPUs currently on Dropbox's compute infrastructure. We have a jail setup based off of LXC and Provost that allows us to safely execute these trained models in a contained way, responding to user requests as they come in. We use standard distributed systems plumbing for RPCs, queues, etc. to respond to user requests and route them to trained models. We version our trained models, and have an in-house experiments framework that we use to deploy new models and test them against subsets of user traffic to see how they are doing as we ramp up a new model.
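One common way to route a subset of user traffic to a new model version is deterministic hash bucketing. The sketch below is an assumption about how such a ramp-up could work, not a description of Dropbox's experiments framework; `bucket`, `pick_model`, the experiment name, and the model version labels are all hypothetical.

```python
import hashlib


def bucket(user_id, experiment, buckets=100):
    """Map a user to a stable bucket in [0, buckets) for a given experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % buckets


def pick_model(user_id, experiment, rollout_percent, old="model-v1", new="model-v2"):
    """Send roughly rollout_percent of users to the new model, the rest to the old one."""
    return new if bucket(user_id, experiment) < rollout_percent else old


# The same user always lands in the same bucket, so raising rollout_percent
# only ever moves users from the old model to the new one.
choice = pick_model("user-42", "ocr-ramp", rollout_percent=10)
```

Hashing the experiment name together with the user id keeps bucket assignments independent across experiments.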

Most of our day-to-day work is in Python, with occasional use of C++; other parts of Dropbox sometimes use Go and Rust, though we haven't had need for that on the ML team. Note that Dropbox is one of the largest users of Python in the world (Guido van Rossum actually works here).

BTW, the Machine Learning team at Dropbox is hiring. Come join us! Details: https://www.dropbox.com/jobs/listing/533100

angusb 4 hours ago 2 replies      
Fraud detection at GoCardless (YC 11). We use the same tech for training and production classification: logistic regression (sklearn).



Training

- train on an ad-hoc basis, every few months right now, moving to more frequent and regular retraining as we streamline the process

- training done locally, in memory (we're "medium data" so no need for distributed process), using a version-controlled ipython notebook

- we extract model and preprocessing parameters from the model and preprocessors that were fit in the retraining process, dump to a json config file in the production classification repo


Production classification

- we classify activity in our system on a nightly cron*

- as part of the cron we instantiate the model using config dumped from the retraining process. This means the config is fully readable in the git history (amazing for debugging by wider eng. team if something goes wrong)

- classifications and P(fraud) get sent to the GoCardless core payments service which then decides whether to escalate cases for manual review
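The "dump parameters to JSON, re-instantiate in production" flow above can be sketched roughly as follows. This is an illustration under assumptions, not GoCardless's code: the feature names and numbers are invented, and in a real sklearn setup the values would presumably come from attributes such as `LogisticRegression.coef_` / `intercept_` and `StandardScaler.mean_` / `scale_`. The scoring side is pure Python so the config is everything production needs.

```python
import json
import math

# Hypothetical values standing in for what the fitted preprocessor and model expose.
trained = {
    "features": ["amount", "account_age_days"],
    "scaler_mean": [50.0, 200.0],
    "scaler_scale": [25.0, 100.0],
    "coef": [1.2, -0.8],
    "intercept": -2.0,
}

# Retraining side: a readable, diffable text file to commit to the production repo.
config_text = json.dumps(trained, indent=2, sort_keys=True)


def load_model(text):
    """Production side: rebuild a P(fraud) function from the JSON config alone."""
    cfg = json.loads(text)

    def p_fraud(row):
        z = cfg["intercept"]
        for name, mean, scale, weight in zip(
            cfg["features"], cfg["scaler_mean"], cfg["scaler_scale"], cfg["coef"]
        ):
            z += weight * (row[name] - mean) / scale  # standardize, then weight
        return 1.0 / (1.0 + math.exp(-z))  # logistic function

    return p_fraud


p_fraud = load_model(config_text)
score = p_fraud({"amount": 50.0, "account_age_days": 200.0})
```

Because the config is plain JSON, a change in any coefficient shows up as a one-line diff in git history, which is the debugging benefit mentioned above.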


* We're a payments company, processing Direct Debit bank-to-bank transfers. Inter-bank Direct Debit payments happen slowly (typically >2 days) so we don't need a live service for classifications.

Quite simple as production ML services go, but it's currently only 2 people working on this (we're hiring!).

gallamine 2 hours ago 0 replies      
We wrote a blog post detailing a lot of how we do our streaming ML system at Distil (https://resources.distilnetworks.com/all-blog-posts/building...). We classify web traffic as humans or bots in realtime.

Scoring: Essentially raw logs go to kafka. Java processes then read raw logs and aggregate the per-user state in real-time. State is stored in Redis. When a user's state is updated it is sent into another kafka topic that is the "features" topic. Features are simultaneously saved to HDFS for future training data and then consumed by a Storm cluster for doing prediction. Storm is running pickled scikit-learn models in bolts that read in the features and output scores. Scores are sent into a "score" kafka topic. Downstream systems reading the scores can read from this kafka topic.
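The data flow described above (raw logs, per-user state, a "features" topic, a "score" topic) can be mimicked in memory to show its shape. This is only a toy: `deque`s stand in for Kafka topics, a dict stands in for Redis, and the `score` function is a placeholder rule where the real system runs a pickled scikit-learn model. The log fields and feature names are invented.

```python
from collections import defaultdict, deque

# Stand-ins: deques for Kafka topics, a dict for the Redis per-user state.
raw_logs = deque([
    {"user": "a", "path": "/x"},
    {"user": "a", "path": "/x"},
    {"user": "b", "path": "/y"},
])
state = defaultdict(lambda: {"requests": 0, "paths": set()})
features_topic = deque()
scores_topic = deque()


def aggregate(log):
    """Update per-user state and emit the refreshed feature record."""
    s = state[log["user"]]
    s["requests"] += 1
    s["paths"].add(log["path"])
    features_topic.append({
        "user": log["user"],
        "requests": s["requests"],
        "distinct_paths": len(s["paths"]),
    })


def score(features):
    """Placeholder scoring rule; the real system would call a trained model here."""
    ratio = features["distinct_paths"] / features["requests"]
    return {"user": features["user"], "bot_score": round(1.0 - ratio, 3)}


while raw_logs:
    aggregate(raw_logs.popleft())
while features_topic:
    scores_topic.append(score(features_topic.popleft()))
```

Every state update produces a new feature record, so downstream consumers see one score per log line rather than one per user.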

Time from receiving a log to producing a score is ~x seconds or so.


Training data is stored into HDFS from the real-time system, so our training and production data are identical. We have IPython notebooks that pull in the training data and build a model and a scikit-learn feature transformation pipeline. This is pickled and saved to HDFS for versioning. When the Storm cluster starts a topology, it loads the appropriate model from HDFS for classifying data.
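The flow described above (raw logs, per-user state, a "features" topic, a scoring step, a "score" topic) can be sketched in-process as below. This is an illustration under loud assumptions: plain deques stand in for the Kafka topics, a dict stands in for Redis, and `toy_model` is a made-up rule standing in for the pickled scikit-learn model; none of the names come from Distil's system.

```python
from collections import defaultdict, deque

features_topic = deque()  # stands in for the "features" Kafka topic
scores_topic = deque()    # stands in for the "score" Kafka topic
# Stands in for the per-user state kept in Redis.
user_state = defaultdict(lambda: {"requests": 0, "paths": set()})


def aggregate(log):
    """Update per-user state from one raw log and emit a feature record."""
    state = user_state[log["user"]]
    state["requests"] += 1
    state["paths"].add(log["path"])
    features_topic.append({
        "user": log["user"],
        "requests": state["requests"],
        "distinct_paths": len(state["paths"]),
    })


def toy_model(features):
    """Hypothetical scoring rule; the real system loads a pickled model."""
    return min(1.0, features["requests"] / 100.0)


def score_pending():
    """Consume every pending feature record and publish a score for it."""
    while features_topic:
        features = features_topic.popleft()
        scores_topic.append({"user": features["user"], "score": toy_model(features)})


for path in ["/", "/login", "/"]:
    aggregate({"user": "u1", "path": path})
score_pending()
```

The point of the shape is that the feature records are the single hand-off: the same records that feed scoring are what would be saved for training.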

antognini 4 hours ago 1 reply      
My case is probably not quite what you're looking for, but I'll describe it anyway. I develop models for automated EEG analysis. The models are all in a standalone program that neurologists use. Their computers may or may not be connected to the internet, so all computation has to be local.

I have a set of training records (usually on the order of ~100). I do three folds, and train models on 2/3 of the data, validate on the last 1/3. I train using Tensorflow, usually experimenting with a few hundred different architectures. Then I combine the best models we have trained on the three folds into an ensemble. The ensemble can be too big to be practical, so sometimes the ensemble is distilled to a smaller model by training to the ensemble outputs. (The models have to run in real-time, and there are a lot, so it should take any particular model no more than ~10ms to process 1 second of data.) The final model is then tested on a completely independent dataset to determine our performance.
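A rough sketch of the three-fold split and the ensemble averaging described above. The function names are hypothetical, the "models" are stand-in callables rather than TensorFlow networks, and averaging outputs is an assumed way of combining the ensemble (the comment does not say how the models are combined).

```python
def k_fold_splits(records, k=3):
    """Yield (training, validation) pairs; each fold is validation exactly once."""
    folds = [records[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield training, validation


def ensemble_predict(models, x):
    """Average the outputs of the per-fold models (assumed combination rule)."""
    return sum(model(x) for model in models) / len(models)


records = list(range(9))  # placeholder record ids
splits = list(k_fold_splits(records))

# Stand-ins for the best model trained on each fold.
models = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x]
ensemble_output = ensemble_predict(models, 1.0)
```

Distillation, as described, would then train one smaller model to reproduce `ensemble_predict` outputs instead of the original labels.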

As these models are training I develop different experiments to try to probe the behavior of the models to make sure that they're working as expected. This usually motivates new architectures or features to experiment with.

Once the model has been trained, I export all the model weights to a bunch of big static C++ arrays. I've written my own feed-forward NN layers (fully connected, convolutional, and LSTM) to use these arrays of weights in our C++ code. (Tensorflow Serving didn't seem to make much sense for our use case and it was just easier to write the basic NN layers.)
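Exporting weights as static C++ arrays can be done with a small code generator. The sketch below is an assumption about what such a script could look like, not the author's tool; `to_cpp_array` and the array name are hypothetical, and it only handles finite values.

```python
def to_cpp_array(name, weights):
    """Render a flat list of finite floats as a static C++ array definition."""
    # repr() of a Python float round-trips exactly; the 'f' suffix makes it a float literal.
    values = ", ".join(f"{float(w)!r}f" for w in weights)
    return f"static const float {name}[{len(weights)}] = {{{values}}};"


line = to_cpp_array("dense1_weights", [0.5, -1.25, 2.0])
```

Here `line` is the text of one C++ definition; a full export would write one such definition per layer into a header that the hand-written layers include.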

When our resident neurologist feels like it's working well enough, we apply to the FDA for approval, and, who knows, maybe in a few years it'll be allowed in the product!

viksit 4 hours ago 1 reply      
Conversational ML Pipeline (Myra Labs).

We have a few interesting challenges. Our system is set up to create models for a variety of customers, and thus needs to be able to build, update and serve a couple of thousand models at any given time. Since we need GPUs, this has required some custom work rather than being able to piggy back on existing frameworks. We leverage Tensorflow for most things.

- We have a distributed job processor cluster running on GPU nodes. All models are trained on this. We wrote a custom framework in Python for this. It can train any kind of model (keras - tf/theano, sklearn, et al).

- We store our models on S3. Versioning of models becomes important, and we looked around to see how others do it before we rolled our own.

- We have a serving system that is a heavily modified/forked version of Tensorflow Serving, written in C++. Again on GPU machines. Why C++? Because we can leverage binary model formats and reduce memory usage/make things more efficient. The reason we forked TF Serving almost a year ago was that we had a large number of features we needed that TF's rapid versioning cycle/breaking API updates just didn't let us do. This system is able to load and unload models dynamically from the S3 source, as well as distribute them across nodes and balance queries into them. It can load all TF graphs, but also supports running other kinds of models through callouts to other "engines" (eg, if we want to use sklearn or CNTK in the future).
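Loading and unloading models dynamically when there are far more models than memory can be sketched as a bounded cache. The comment does not say which eviction policy the serving system uses; least-recently-used is an assumption here, and `ModelCache` and `fake_loader` are hypothetical names (the real loader would fetch from S3).

```python
from collections import OrderedDict


class ModelCache:
    """Keep at most `capacity` models loaded, evicting the least recently used."""

    def __init__(self, loader, capacity):
        self._loader = loader          # callable: model_id -> loaded model
        self._capacity = capacity
        self._models = OrderedDict()   # ordered from least to most recently used

    def get(self, model_id):
        if model_id in self._models:
            self._models.move_to_end(model_id)  # mark as most recently used
            return self._models[model_id]
        model = self._loader(model_id)
        self._models[model_id] = model
        if len(self._models) > self._capacity:
            self._models.popitem(last=False)    # unload the least recently used
        return model

    def loaded_ids(self):
        return list(self._models)


loads = []


def fake_loader(model_id):
    """Stand-in for fetching and deserializing a model from storage."""
    loads.append(model_id)
    return f"model:{model_id}"


cache = ModelCache(fake_loader, capacity=2)
for model_id in ["a", "b", "a", "c"]:
    cache.get(model_id)
```

After this sequence, "b" has been unloaded because "a" was touched more recently, and "a" was only loaded once.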

poorman 3 hours ago 2 replies      
I find a lot of these frameworks to be overkill.

A rating event comes into our web server (Python) and is sent to SQS; the event is then pulled by our custom artificial neural network (deep learning) script, written in Go. A model is trained and serialized. Next, the serialized model is uploaded to Postgres, where it is fetched by the web service (also written in Go) to serve predictions.

We update our models within 15 seconds of a user rating. Every month with millions of ratings, we re-train millions of models and serve billions of predictions.

isoprophlex 3 hours ago 0 replies      
Thanks everyone for the useful answers.

I'll go too: We have 20M customers with a history of 1-1000 activities per customer, updated once daily. We do nightly runs of a collaborative filtering algo to produce per-customer suggested activities, and push this to a SQL database. Business users can load recommendations for each customer from this db...

andy_wrote 1 hour ago 0 replies      
At Custora (YC W11), analyzing consumer behavior and presenting the insights through a web-based interface.

We have an in-house system for modeling DAGs of statistics, perhaps like Airflow (I've only read about Airflow, never used it, so can't comment more there). Computed values at different nodes in the DAG can be cached by their arguments. So for example, a fitted model would be a very logical node to cache, as you'll have many other potential statistical requests that would depend on the model.

Refitting the model would entail clearing that node; alternatively, if you changed the inputs to the fit, a separate stat would be cached, and other dependent stats would know the difference and point to the right stat depending on the arguments you passed.
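Caching a node's value by its arguments, with refitting done either by clearing the node or by passing different inputs, can be sketched as below. This is not Custora's system; `StatCache`, `fit_model`, and the `window_days` argument are hypothetical, and arguments are assumed to be hashable.

```python
class StatCache:
    """Cache computed stats keyed by (stat name, arguments)."""

    def __init__(self):
        self._values = {}
        self.computations = 0  # counts real computations, for illustration

    def get(self, name, compute, **args):
        key = (name, tuple(sorted(args.items())))
        if key not in self._values:
            self._values[key] = compute(**args)
            self.computations += 1
        return self._values[key]

    def clear(self, name):
        """Drop every cached value of one stat, forcing a refit on next use."""
        for key in [k for k in self._values if k[0] == name]:
            del self._values[key]


def fit_model(window_days):
    """Stand-in for a slow model fit."""
    return {"mean_spend": window_days * 0.5}


cache = StatCache()
cache.get("fitted_model", fit_model, window_days=30)
cache.get("fitted_model", fit_model, window_days=30)  # served from cache
cache.get("fitted_model", fit_model, window_days=60)  # new inputs, separate entry
count_before_clear = cache.computations
cache.clear("fitted_model")
cache.get("fitted_model", fit_model, window_days=30)  # recomputed after clearing
```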

A lot of our models are Bayesian in nature, so generating predictions is typically a two-part process: training parameters, which is slow, can happen infrequently and which need not critically include all of the latest data, and applying the posterior update, which is faster and which needs to be redone on every data update. (We import data in batch.) Retraining is somewhat ad-hoc right now, although we've got an active project on the docket to systematize this and produce streamlined before-and-after comparisons.
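The split between slow, infrequent parameter training and a fast posterior update on each data import can be illustrated with a conjugate Beta-Binomial model. That model choice is purely an assumption for illustration (the comment does not name its models), and the prior values below are made up.

```python
def posterior_update(prior_alpha, prior_beta, successes, trials):
    """Fast step: fold new observations into a Beta prior (conjugate update)."""
    return prior_alpha + successes, prior_beta + (trials - successes)


def posterior_mean(alpha, beta):
    """Expected success probability under a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Slow step (done infrequently): hypothetical prior parameters from training.
prior_alpha, prior_beta = 2.0, 8.0

# Fast step (done on every data import): 3 purchases in 10 visits for one customer.
alpha, beta = posterior_update(prior_alpha, prior_beta, successes=3, trials=10)
prediction = posterior_mean(alpha, beta)
```

The prior stays fixed between retrainings, so each batch import only needs the cheap per-customer update.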

Computation work is dispatched to EC2 instances by a Redis-based job manager (qless, developed by Moz, formerly SEOmoz). We do the stats work in R, in-memory. For things like order and transactional data this is feasible even for relatively large retailers, but we're gradually looking at involving Spark more so that we have the capacity for larger analyses. (We already do use Spark for some non-ML tasks, like customer data import.)

gidim 4 hours ago 1 reply      
We use scikit-learn to train the models every few weeks when we get more labeled data. Once a model is trained we use joblib to save the entire pipeline (normalization, feature processing etc). In production we have a thin Rest wrapper that loads the model pipeline to memory and serves prediction requests. We scale the number of these servers based on the load.
pilooch 4 hours ago 0 replies      
Deepdetect for both dev and prod, with the minimum code in front of it. This setup is definitely not able to accommodate all modern ML needs, but the fast and secure model update from dev to prod is the easiest for us. Disclaimer: DD author so the bias is very high, apologies for this, maybe my comment will remain useful to some.
deepnotderp 3 hours ago 0 replies      
Training/retraining: Done arbitrarily, mostly locally, sometimes distributed. Done with either TensorFlow or Torch. We have a custom backend for it. Oftentimes linked with Keras.

Inference: For our custom platform, we have our own framework (in progress). For other products where our hardware is unavailable, we will use either MXNet mobile or a custom framework on mobile frameworks. For deployments where we have the luxury of a cloud link, we will either use TensorFlow serving (with a custom backend once it's done) or Flask linked to TF/Caffe/Keras (also with a custom backend once it's done).

gumby 3 hours ago 1 reply      
Can anyone comment on the relative computational requirements of training vs using the classifier? antognini described a system that trained in tensorflow and then is used by a simple C++ program (not on a GPU?).

Do you use the same resources/hardware for both training and executing?

d4rth 3 hours ago 1 reply      
For your feedback I'm offering karma or bug bounties (if you can identify me on github)!

Forestry High Resolution Inventory - Predicting Forest Attributes at Landscape Scale
----------------------------------

This is a really promising business line. Our pipeline is not very automated and it has no application level data management - we're a ways away from that (e.g. the code isn't even under version control).

For our current business and data volumes, the system works.

We get target attributes (forest characteristics, e.g. species composition) by doing field surveys or other methods (e.g. expert photo interpretation).

We acquire Lidar, Color Infrared data, and Climatic models to develop landscape-scale features. For the lidar derived features we use LAStools. We use Safe software FME for generating some features from the color infrared data (e.g. vegetation indices). We use regional climate models suitable for the area of interest for the climatic indices.

The reference data and the target attributes are spatially joined. We end up with a lot of features and use the subselect library's "improve" function in R in an iterative fashion to reduce the number of features; leave-one-out LDA is used to assess the performance of each subset of features. If the target variable is not categorical, we have to classify it in order to run the LDA. The procedure produces a lot of candidate feature sets; our process to select a particular feature set is human driven. We do not have a formalized or explicit rule. The chosen subset of features is used in a KNN routine. Some components are in python and state is shared using the filesystem.
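Scoring candidate feature subsets with leave-one-out evaluation can be sketched as below. Note the substitution: the comment assesses subsets with leave-one-out LDA in R and only then uses KNN for prediction; this sketch uses a small KNN classifier for the leave-one-out step as well, purely to stay self-contained. The data and all names are made up.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Majority label among the k training rows nearest to `query`."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(rows, feature_idx, k=1):
    """Hold out each row in turn and predict it from all the others."""
    projected = [([x[i] for i in feature_idx], y) for x, y in rows]
    hits = 0
    for i, (x, y) in enumerate(projected):
        rest = projected[:i] + projected[i + 1:]
        hits += knn_predict(rest, x, k) == y
    return hits / len(projected)


# Toy plots: feature 0 separates the species, feature 1 is noise.
rows = [
    ([1.0, 5.0], "pine"), ([1.2, 1.0], "pine"), ([0.9, 9.0], "pine"),
    ([3.0, 5.1], "spruce"), ([3.1, 1.1], "spruce"), ([2.9, 9.1], "spruce"),
]
accuracy_by_subset = {
    subset: leave_one_out_accuracy(rows, list(subset))
    for subset in [(0,), (1,)]
}
```

A table like `accuracy_by_subset` is one way to make the currently human-driven choice between candidate feature sets explicit, for example by always taking the best-scoring subset.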

We do a lot of tuning on a project by project basis.

At the end of the prediction, there are several transformations to derive other forest characteristics and munge the data into regional standards.

The whole modeling process is done on a somewhat high powered desktop. There is one directory that holds all the code and configs; scripts are invoked which pull settings from a corresponding config. Some of the code is stateful (i.e. stores configuration) and the configs are global (their location is not abstracted) so in order to run this process concurrently, it has to be deployed in separate machines.

Municipal Sewer Backup
----------------------

We ported some of the components described above to predict sewer backup risk. Key components were broken into R and python libraries and dockerized. The libraries are here pylearn - https://github.com/tesera/pylearn and rlearn - https://github.com/tesera/rlearn.

A python library (learn-cli, which we haven't open sourced) uses r2py to coordinate/share state between the two libraries. The process for training still requires a user to select a model from all of the candidates that the variable selection routine produces. This selection is made at prediction time; all the candidate models are stored in one file and an id is specified for which one to use. learn-cli is dockerized and we have it deployed on ECS. It scales pretty well.

This solved many of the challenges in the forestry pipeline, but we haven't been able to bring everything from the forestry model into deployment this way due to a gap between data science and development. I've been looking into Azure Machine Learning as a possible solution for this. I have benchmarked some builtin models there and gotten performance identical to that of our highly customized process.


Would love to hear your advice for formalizing & automating, or alternatives to, our process for the forestry model pipeline.

Also, if you have a highly automated machine learning pipeline - what are your data scientists' responsibilities? It's not clear to me how our data scientist jobs would evolve if they didn't have to manually run several scripts to fit and select a model and generate the predictions.

Softbank to sell 25% stake in ARM Holdings to Saudi-backed investment group bbc.co.uk
94 points by dan1234  10 hours ago   53 comments top 7
awinter-py 8 hours ago 5 replies      
The saudis are desperate to use their sovereign wealth fund to diversify (i.e. save) their economy.

The country has a lot of problems but I'm legitimately excited to see what they come up with. The first step to getting out of a hole is to recognize you're in one, and they have.

With any luck, economics-driven policy will lead to social change as well.

grabcocque 9 hours ago 1 reply      
I think the headline is misleading at best.

It's transferring 25% of its stake into a fund in which it has joint ownership. The headline makes it sound like it's dumping its investment.

The mods changed the BBC's clickbait to something more accurate. Thanks mods!

petra 8 hours ago 3 replies      
So, does the availability of almost infinite oil money (I know, maybe not a realistic assumption, just a hypothesis) change the chances of ARM overtaking Intel in servers or PCs?
kbumsik 56 minutes ago 0 replies      
I was always wondering why Softbank bought ARM. ARM is of course a market-leading processor design company, but it is almost solely a fabless, intellectual property company. I don't see how ARM would benefit Softbank.
bargl 9 hours ago 0 replies      
Softbank also just started a merger of Intelsat and OneWeb. https://techcrunch.com/2017/02/28/oneweb-is-merging-with-int...
darkhorn 4 hours ago 0 replies      
Saudis own stakes of Turkish Telekom. And guess which telekom company is the worst in Turkey.

Saudis only milk. No research or development.

technological 6 hours ago 1 reply      
That's nice. For anyone wondering, the Saudi-backed investment group is called "Mubadala".

Mubadala owns 100% of GlobalFoundries (which bought the hardware division from IBM)

Abacus use can boost math skills kottke.org
103 points by Osiris30  12 hours ago   57 comments top 16
JamilD 9 hours ago 1 reply      
I find play to be the most effective way of learning.

When I was learning, for example, the backpropagation algorithm, no amount of lectures could've helped me understand better than just drawing a network, thinking "what happens when I change the weights here?", and toying with the equations. It gives you an intimate familiarity and understanding that you can't get anywhere else.

racl101 18 minutes ago 0 replies      
It's a great way to understand our base 10 number system.

That definitely gets lost when using calculators and computers.

As well, I have to believe it would help a child understand any other number system such as binary, octal and hexadecimal.

meow_mix 4 hours ago 2 replies      
Maria Montessori had a lot to say on the importance of incorporating tactile elements in early education. This is why Montessori schools generally incorporate tools like the abacus (at least mine did). Not surprised we're re-learning some of her findings today
myth_drannon 6 hours ago 0 replies      
Related to abacus use. Sam Harris podcast "Complexity & Stupidity" https://www.samharris.org/blog/item/complexity-stupidity mentions abacus skills. The skill "spreads" to other areas of the brain; they call it "complementary cognitive artifacts"

Quotes :

Harris: What else would you put on this list of complementary cognitive artifacts?

Krakauer: The other example that I'm very enamored of is the abacus. The abacus is a device for doing arithmetic in the world with our hands and eyes. But expert abacus users no longer have to use the physical abacus. They actually create a virtual abacus in the visual cortex. And that's particularly interesting, because a novice abacus user like me or you thinks about them either verbally or in terms of our frontal cortex. But as you get better and better, the place in the brain where the abacus is represented shifts, from language-like areas to visual, spatial areas in the brain. It really is a beautiful example of an object in the world restructuring the brain to perform a task efficiently -- in other words, by my definition, intelligently.

hedgew 9 hours ago 0 replies      
>Based on everything we know about early math education and its long-term effects, I'll make the prediction that children who thrive with abacus will have higher math scores later in life, perhaps even on the SAT

Also not so surprisingly, children who thrive in early math education will tend to have higher math scores later in life.

That's about all we seem to know about early math education anyways.

neves 5 hours ago 2 replies      
Do you have references about how to use an abacus?

I just know the obvious uses: counting, summing and subtracting. There is a famous story about Richard Feynman and the abacus: http://www.ee.ryerson.ca:8080/~elf/abacus/feynman.html where the guy makes a lot of difficult calculations.

robbiewxyz 3 hours ago 0 replies      
I absolutely see this. Just today I've been helping teach a friend's child to do multiple-digit multiplication on paper. In that kind of thing, understanding place values (the ones-tens-hundreds columns) is so important. It's amazed me how many high school students even don't quite get the idea of place values and what base 10 means. The abacus requires a good understanding of that concept from the start.

So, while the "skill spreading" might also contribute, with the abacus, it's a basic and essential concept to really understand for all higher-level math.

jimmies 7 hours ago 2 replies      
It seems like the things that are more painful to learn and use make your brain work harder, and thus make you learn better. Or, maybe it is the case that people who know to use the hard stuff are interested in the subject enough to learn how to do it the hard way. You wouldn't be surprised that those who use Assembly to program tend to have better programming skills compared to those who only can program in Visual Basic, would you?

Veritasium also mentioned this recently: https://www.youtube.com/watch?v=UBVV8pch1dM

To me, the more interesting question to ask is (1) whether it is the abacus that makes children learn better, or it is just that children who choose to use the abacus learn better (2) whether teaching abacus use at the beginning has the same effect as teaching abacus use later on after the students already know how to use the calculator. If children who choose to use the abacus learn better, then it wouldn't surprise me, but it means that teaching abacus wouldn't help. If that is false and (2) is true, then we know we are better off teaching abacus (or assembly) -- it doesn't matter when. But if (2) is false, it means that we have a huge trade-off to consider. Because either we teach the hard stuff at the beginning and discourage a lot of students, or we don't and have worse learning outcomes.

retox 7 hours ago 0 replies      
Probably off-topic but I checked out Isaac Asimov's "Realm of Numbers" from my library on the recommendation of HN and it's great. One of the earlier chapters outlines how the abacus and its concept of 'the unmoved row' could have helped foment the idea of zero. Very interesting text, no idea if the ideas are considered incorrect these days but still entertained and educated. 10/10 read.
hammerandtongs 8 hours ago 0 replies      
So from the hour I wasted yesterday poking around the web after reading this article -


Seems like the best free resource.

This book seems like the standard english text on soroban -


The first few chapters are great. This book from the sixties mentions that abacus are actually a middle-eastern invention and came east with trade.

Abacus in fixed frames are a later invention for speed but earlier versions would have more resembled a backgammon table using pebbles.



This led to this pretty interesting paper -


It seems like there is a pretty large gap in our understanding of how people actually did calculation?


ideonexus 8 hours ago 3 replies      
I see these kinds of benefits in over-learning keyboards as well. The few classes I've taught to children and teenagers, you can see a dramatic difference between the students who are familiar with keyboards and those whose parents--I speculate--don't allow them screen time.

This extends into adulthood. The people I work with who have that well-practiced familiarity with keyboards and keyboard shortcuts--using them as if extensions of themselves--easily adapt to any new software interface. The adults who hunt around the screen with a mouse have a great deal of difficulty with change and are almost helpless when confronted with a new interface--even if the menu items are all in the same places as the last software.

It's a comfort and familiarity with the computer that allows the user to easily adapt to new things. It totally makes sense to me that working with an abacus would have similar benefits for making students comfortable with math.

forkandwait 2 hours ago 1 reply      
Can anyone recommend an actual abacus to buy?
partycoder 8 hours ago 4 replies      
In my opinion the best abacus you can get is the 5 beads per rod one (soroban).

The 7 bead one pictured in the article header is slower to work with (classic suanpan).

The 10 bead one used in the west for kids may be good to have an intuition about quantities, but for practical use it is the worst.

You can get a soroban, learn the basic algorithms, then use a soroban drilling app to gain speed. With practice, you can exploit muscle memory and stop using a physical abacus.

The abacus provides a workable mental model for numeric operations. In contrast, calculators are opaque machines. They take inputs and provide an answer but do not expose their inner workings in a way that can be assimilated and learned by the user.
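The mental model the soroban exposes is easy to make concrete: each rod holds one decimal digit, using one upper bead worth 5 and four lower beads worth 1 each. A minimal sketch follows; the function names are made up for illustration.

```python
def soroban_rods(n):
    """Return one (upper, lower) bead count per decimal digit, most significant first."""
    if n < 0:
        raise ValueError("only non-negative integers are representable here")
    # divmod(digit, 5): how many 5-beads (0 or 1) and how many 1-beads (0 to 4) are set.
    return [divmod(int(digit), 5) for digit in str(n)]


def rods_to_int(rods):
    """Read the number back off the rods, left to right."""
    value = 0
    for upper, lower in rods:
        value = value * 10 + upper * 5 + lower
    return value


rods = soroban_rods(1907)
```

For 1907 this gives `[(0, 1), (1, 4), (0, 0), (1, 2)]`: the 9 is one five-bead plus four one-beads, and the 0 is an untouched rod, which is the place-value idea several comments here point to.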

ForHackernews 7 hours ago 2 replies      
There's a cute story in Surely You're Joking, Mr. Feynman where a man with an abacus challenges him to an arithmetic contest: http://www.ee.ryerson.ca/~elf/abacus/feynman.html
n00b101 7 hours ago 3 replies      
The American elementary and high school education system would do young students a huge favour by abandoning the use of electronic calculators. I have no doubt that the abacus is a superb pedagogical tool but simple pencil and paper calculation would be a huge improvement over the calculator culture in early mathematical education.

I majored in Applied Mathematics in university and did not use a calculator in a single class or exam - it was explicitly forbidden to use them.

But I was also educated in the American elementary and high school curriculum. I am thoroughly convinced that the use of calculators in the American system does a huge disservice to students. The calculator culture begins at an astonishingly young age, in elementary school, when kids are first introduced to the Texas Instruments TI-108 calculator [1]. This bright blue and red, solar-powered gadget is very exciting for children in a classroom, but it is poison. At the very age that students should be drilling mental math, they are instead given this huge crutch. It's like never taking the training wheels off a child's bicycle, robbing them of the opportunity to actually learn to ride a bicycle.

The same story is repeated in middle school (with the same TI-108). And then you reach high school. You would think that at least now the training wheels would come off. But instead, you are "upgraded" from a bicycle with training wheels to an adult-sized tricycle. I am, of course, speaking of the Texas Instruments "graphing calculator" [2]. This abomination is a full-fledged programmable computer with CAS software. You are encouraged (even required) to use this machine all the way through Advanced Placement Calculus. If you are not distracted in class playing video games on your "graphing calculator," then you may use it to automatically solve algebraic and trigonometric equations and evaluate derivatives and integrals. Why bother memorizing identities when you can just program them into a computer? If you have been programming from an early age, then you may be in the worst place possible because you are predisposed to wasting your time trying to program your way out of math homework instead of focusing on learning math. When students complain about why they have to learn Calculus, your American math teachers tell you that nobody "in the real world" actually solves integrals by hand and it's all done on computers, but you have to learn it anyway for obscure pedagogical reasons. An aspiring future Comp Sci student is likely to take this as a clear signal that it's all a waste of time, and soon high school will be over and you'll never need to worry about this antiquated Calculus thing again - your computer will do it all for you.

There are other ways in which American mathematics education is extremely flawed, but the calculator culture is the worst offence by far. I suspect that the bureaucrats who decide on such things rationalize this by calibrating the curriculum to what they see as the "lowest common denominator" student who will never need to learn anything beyond basic algebra and arithmetic. The goal of the system is to train this student to be a productive, low-wage worker, with just enough mathematical training to be able to mindlessly punch numbers into a cash register or simple spreadsheet on the rare occasion that the need for numerical computation may arise in their "careers."

I have a three year old and I have no idea what I could do to save him from this insane system, short of moving to a different continent.

[1] https://en.wikipedia.org/wiki/TI-108
[2] https://en.wikipedia.org/wiki/Comparison_of_Texas_Instrument...

jweir 3 hours ago 0 replies      
       cached 8 March 2017 23:02:01 GMT