(let a 1 b 2 c 3 (+ a b c)) -> 6.
Other than that he's probably just enjoying spending quality time with his young family after a relentlessly busy few years.
However, he is very active on Twitter.
The first problem is that mobile devices are pretty much inherently asynchronous. There are apps that you would use at the same time as another person (like real-time games) but especially on cellular, lag is an issue. This pushes people into designing products that can tolerate lag measured in seconds (because that isn't shockingly bad performance on cellular networks for apps that use your standard off-the-shelf tools like REST/HTTPS/AWS for example). This produces a lot of asynchronous, or semi-asynchronous applications.
Now, partly due to those product designs and partly due to people having lives, they use these apps asynchronously. You pull out Snapchat, fire off a message, and go back to reading Reddit or whatever. Snapchat is off. There's no way to reach you.
Okay, so why don't we run Snapchat in the background? Well, there are layers of reasons. The first layer is that it costs energy, and the mobile revolution is in large part possible because software and hardware developers got very aggressive about energy management. If we ran things in the background like you do on your laptop, your phone would need to be as big as your laptop and plugged in regularly, like your laptop. There are also practical problems, like announcing every time your network address changes, or even figuring out when your network address changes, which is hard to do passively. I'm glossing over some networking details, but there's a deep principle here: within the design of the existing TCP/IP/cellular stack you can't have reliability, energy efficiency, and decentralization. You must pick two.
Apple, very presciently IMHO, has decided to legislate a lot of rules about background processes that I don't have time to go into here but basically they try to regulate the types of work that apps can do in the background to only the energy-efficient kind. The rules are actually pretty well-designed but they're still rules and they restrict what you can do. Android doesn't have this limitation but unless your product is Android-only you're going to comply with the iOS rules when you design a feature that communicates across platforms.
Okay, so we can't run Snapchat in the background. But what if two users happen to have it open? We can use p2p then, right?
Well sure. But the user may be on a network that blocks p2p traffic. That is their network's fault, but they still e-mail you to complain, and leave bad reviews for your product, because as far as they can see "the Internet works" so it's your app's fault.
So what you do is design a scheme that uses p2p by default and client-server as a fallback. There are actually apps that work like this. The problem here is that instead of getting support tickets about it not working, now you get support tickets about it being slow.
And there are ways to solve this, like permanently giving up on p2p after a certain number of failures, for example. But the first experience is still pretty bad, which is what counts in mobile. And I remind you, this p2p feature is already a scheme that only works in the 0.3% of cases where users actually have the app open at the same time, and now you want to add code that disables p2p in even more cases than it's disabled already. This process continues until basically zero actual customers will ever use the feature.
And we haven't even gotten to cases like "Why didn't this message get delivered to all my devices?" because there is just zero chance that any customer, anywhere, will have all his devices turned on at the right time to receive incoming p2p connections.
Now, non-messaging products like Spotify or Netflix are more plausible, but you still have to ask who wins here. The customer experience is worse, both because of connectivity problems and because of the increased bandwidth bills and energy-efficiency losses that come with rebroadcasting content to other users. Developers are worse off because they probably have to build both client-server and p2p architectures, because p2p isn't reliable enough on its own. Support is worse off because almost any issue is potentially a p2p-related issue: have you tried disabling p2p and seeing if the issue persists?
There's really no reason, certainly no compelling business case, to inflict that much pain on any mobile product I can think of. I mean, there's probably a place where p2p makes sense--we live in a big world--but in general it makes things much worse for everybody.
Not all networks support P2P connections, and those that do often require intervention by the user.
So you have two issues: some users will never get to use your application, and those who potentially could will likely need customer support to help them configure their network correctly.
Corporate networks are the worst for this. They aren't going to change their rules for your application (yes, they might, but don't assume that starting out).
A much greater percentage of home networks can support P2P networking, but your application probably needs to support STUN as well as UPnP.
Some home routers will never work; others only work if you configure them correctly. And that's where it gets messy. Is it enough to tell a customer to 'go figure it out' when it pertains to router configuration? You might get away with that with PC gamers, but I'd argue any other segment of people will have no idea how to do it and will need some help. So now you have to try to figure out the end user's home router configuration as best you can, remotely. Huge drain on customer support resources, which in a small startup usually means the developers.
So why go P2P with all these headaches? Unless you have a really strong reason to use P2P, like keeping latency down between peers, you don't bother.
As an example, Skype years ago, with mostly desktop/laptop clients, was largely p2p (just login was centralized). With increasing numbers of tablets, WAN connections, and smartphones, they switched to central servers.
So sure, you might be able to spend a man-year and get an awesome, robust, and performant solution. But your competition will have spent that time actually making users happier and stealing your market.
If we take your Snapchat example, if one user is trying to send a message to another and their app is closed or their device is off, where does the message go?
On the desktop you have the luxury of mostly running in the background to pick up the ping, but that's almost never the case on mobile.
It's very hard (impossible, really) to deploy P2P technologies on a mass scale without thousands of users encountering problems with their routers and firewalls.
For mass market products, you can't get away with asking people to whitelist your app, make sure port 28777 is open for UDP, etc.
Many P2P systems have freeloader issues, which can usually be resolved/avoided/ignored on desktop systems. But when you add mobile into the equation, with its paltry bandwidth limits and sky-high overage charges, the potential for it to become a problem is much greater.
P2P is just hard, and not that pertinent for most products. And of course there is the network issue (special ports, UDP) that sometimes makes these solutions impossible to deploy in the enterprise world.
Is that even possible?
So by the time a product becomes successful, its core is already using inefficient solutions. And very few companies upgrade them, because the effort is not really worth it business-wise.
If we had been driven by real solutions instead of money, then P2P would probably be king. Direct communication everywhere.
P2P isn't completely reliable. There are many cases where you can talk to a server but you can't talk to a peer, ranging from evil firewalls to excessive layers of NAT to simple things like the target device being offline. Thus, you must code a fallback that talks to a server if you want reliability. This server fallback will work for all situations, so it's necessarily easier to just use it for everything and not bother to code the P2P bit at all.
P2P is also really hard to do well. It's pretty easy to do poorly: have one device tell the other device what its IP address is and a port to connect to, then connect to it. In practice, this fails about 99% of the time because approximately all consumer internet users are behind NAT these days. So then you enter the wonderful world of NAT traversal, meaning you have to deal with horrifying things like UPnP, NAT-PMP, and STUN. And this is when both sides keep the same IP address throughout the connection! Now consider when your smartphone user goes from Starbucks, where he has WiFi, to the bus, where he only has LTE, to home, where he has WiFi again.
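To see why the naive "tell the peer your IP" step breaks, here's a small sketch: the address the OS would route from is almost always a private RFC 1918 address behind NAT, not an address a peer on the internet could actually reach. (The `8.8.8.8` target is arbitrary; connecting a UDP socket sends no packets, it just makes the OS pick a source address.)

```python
import socket
import ipaddress

def local_address():
    """Best-effort guess at this machine's outbound IP address.

    Behind NAT this is almost always a private (RFC 1918) address,
    which is useless to hand to a remote peer.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 80))  # no packet is sent; OS just picks a route
        return s.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # offline / no route at all
    finally:
        s.close()

addr = local_address()
print(addr, "(private)" if ipaddress.ip_address(addr).is_private else "(public)")
```

On a typical home network this prints something like `192.168.x.x (private)`, which is exactly why you end up needing STUN or UPnP to learn an externally reachable address.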
Bandwidth is cheap. Let's say it would take Snapchat one developer-month to implement this, or about $10,000. (I'd wager this would be a strong underestimate, both in terms of time required and the cost of that time.) Amazon S3, to take a random example, charges 12 cents per GB of outgoing transfer to the internet at lower use levels. You can buy 83TB with that $10,000. If your typical Snapchat image is 1MB (they're low resolution, right?) then you'd have to P2P 83 million images before you broke even on the investment. Factor in a more realistic timeframe, a more realistic cost, and the opportunity cost of not having that developer work on something more useful, and the break-even point goes up by an order of magnitude or more.
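The arithmetic above, spelled out (all inputs are the rough figures from this comment, not real Snapchat numbers):

```python
# Back-of-the-envelope break-even: build P2P vs. just pay for bandwidth.
dev_cost = 10_000      # one developer-month, assumed
s3_rate = 0.12         # USD per GB of S3 egress at lower usage tiers
image_mb = 1.0         # assumed average image size, MB

gb_for_same_money = dev_cost / s3_rate              # ~83,333 GB, i.e. ~83 TB
images_to_break_even = gb_for_same_money * 1000 / image_mb  # decimal MB per GB

print(f"{images_to_break_even / 1e6:.0f} million images")  # prints: 83 million images
```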
P2P does get used where it pays off well. That's either high-bandwidth stuff or low-latency stuff. WebRTC does P2P whenever it can. Apple's FaceTime does P2P when they're not disabling that functionality to placate patent trolls. Skype does (or at least did, I seem to recall some changes) P2P for audio and video. And of course nothing beats BitTorrent for sending massive amounts of data to large numbers of people. But it just doesn't pay off unless you're really sending a lot of stuff.
MacOS had commercial apps and a good reputation in certain fields. The hardware was expensive by PC standards but high quality.
Linux was growing like a weed: adding just a little more RAM to the kind of desktop a college student could afford let you do everything that the Sun workstations in the computer lab could do, and you never had to share it.
If BeOS had been free, compatible with less-blessed hardware, and not offended Apple so much, it might have had a chance.
Linux took hold because you could pull any old PC out of the junk pile and load it up and go. BSD, at the time, was fracturing into multiple versions and having legal troubles.
Haiku alpha 4 is awesome. But there are too many other projects out there so there aren't enough resources to make faster progress.
The Wikipedia page http://en.wikipedia.org/wiki/BeOS has a lot more good information.
That's certainly not the only reason the company went under, but it was a major factor.
I think they made a lot of design decisions that were overfitted for 1995, so if they had survived they would have quite a bit of legacy cruft by now. And the brittleness of C++ may have made the cruft worse.
In some cases their idealism seems to have held back practical adoption. Their "pervasive multithreading" made it very difficult to port Java and Mozilla. (The idea of "only native apps, no ports" made perfect sense in the siloed PC market of 1993. By 1997, not so much.) Treating all developers equally meant that professional developers may have gotten shortchanged.
OS/2 was slightly more compatible with existing apps and therefore had slightly more success for a while.
It wasn't "better enough" than the alternatives.
It was better than NeXT (what would become OSX), but not enough to be revolutionary. It was better than Windows, Mac Classic, and Linux, but also not enough to be revolutionary.
The existing closed incumbents won out by market share and inertia, and Linux won out by being free, open, and by basically going viral.
If BeOS had been open and free I think it would have given Linux a run for its money, but it wasn't.
I got the job by a referral of an acquaintance from a local software Meetup.com group that I attend regularly. When I got frustrated with my last job, I approached the leaders of the Meetup group (who I sort of knew as acquaintances by that point) and said "Hey guys, I'm sort of looking around now. Let me know if you hear anything." A few weeks later, one of those guys made an email introduction to me of another guy who works at the company I'm currently at. He brought me in for an interview... at first, I was just kind of going to the interview as a courtesy - I was pretty sure I didn't want to work here - but when I met the guys and the boss I would be working with, I was sold pretty quickly. They seemed like really fun, intelligent guys when we met and they are. If there was a point in the interview that sold it to me, it was probably the point where I asked them if we should start the whiteboard coding portion of the interview and they waved it off, saying "That's just for people who we think are bullshitting us. We can tell you know what you're talking about and what you're doing."
I just got back from the beach for a week, and while it was nice to be away for a little bit, I was genuinely excited to get back in the office on Monday. I don't remember ever feeling that way at any other job I've had, so I guess I have to say this is my "Dream Job". That doesn't mean I want to do it forever - I'm a firm believer that even the "Dreamiest" job doesn't beat working for yourself, but I have to say I really enjoy it. Unless someone offered me something on the order of 2-3x what I make here, I wouldn't consider leaving (and I make a pretty decent amount for my region/experience).
What makes it that way?
1. Boss. Best/most competent guy I've ever worked for. He somehow has the "magic touch" of keeping the team focused on things we can deliver, calling bullshit on all the paper-pushers and meeting-mongrels that try to sap our time ("No one on my team has a company phone. No one in this company has the right to interrupt my devs while they're working."), and rolling up his sleeves and coding when we have to stay super-late to get something done (only happened once). He also judges on "body of work" more than individual incidents ("You went out at 2PM and got drunk with your co-workers yesterday? No problem, you usually get your shit done in good order and on time. You showboated about staying until 1AM last night? Bullshit - on a regular basis, you don't get shit done. Try harder.")
2. Open-ness of the team. "You have been messing around with a new library/framework at home for the last couple of weeks and you really like it? Come in and show it to the rest of the team for a couple hours tomorrow. Team discusses it, weighs benefits/drawback/long-term-maintainability...it's in production a week later." "You met an awesome guy at a meetup last week? Bring him in here and interview him. Two weeks later, he's hired."
3. "Goldilocks" company atmosphere. Not "douchy-SV-startup concrete walls and beanbag-chairs", but not "corporate cubicle farm" either. It's a great mix of people from different parts of the world, different age ranges / genders / etc. We have free sodas/coffee, but not free beer. People leave at 6. There is usually a happy hour every couple of weeks where we get together at a crappy bar and have a great time, but no one's social circle consists entirely of coworkers. If you're married or have kids, people tell you what happened at the happy hour the next day so no one feels like they're "left out".
I'm currently the founder of my own funded company, which in a way sounds like a dream job too, but because there are a lot of non-coding distractions and also huge stress levels, I don't think it's really the same. The cushy open source gig was the dream job. :) Maybe after I'm "successful" I'll go back to that.
Three and a half years on, all I seem to do is fix the same Excel upload errors (it was a temporary quick-fix solution that has always been bottom priority for replacement), go to dumb meetings where they spend half an hour discussing the acronyms used in the menus, and change the colour of items. I never get a chance to focus on anything that takes more than half a day's coding, so basically all the interesting work is now replaced by trivial fixes, usually where the users seem incapable of reading the error message (though with Excel it's impossible to guess what the problem will be).
I was promised a promotion last year, but that hasn't happened yet (I work in Spain, and government cuts mean it's not allowed until we merge with another institute). I have realised recently that it is making me frustrated and deeply unhappy, so I am willing to take a pay hit (on my future wage) to do something that I actually enjoy again.
Ideally I want to go freelance, but there isn't such a big market here in Spain, and the common advice is to build up your portfolio (tricky when your work is in house).
Anyway, as I realised recently: do something you enjoy, and don't get sucked into the management-style thing if that isn't what you want to do. If management are not listening to your suggestions, it is time to get out. The promise of more money has kept me hanging on far too long.
I got my job by starting my own law practice after working at a large tech company + a large law firm. Most lawyers work at firms for several years (or forever) instead of doing their own thing. I did contract programming work for many years before starting my practice so I knew what I was getting into by starting my own business. Without my background in programming I wouldn't have most of my clients. I also get a lot of inbound leads because my website does well on Google (re: former web developer).
My job is awesome because I get to help entrepreneurs start their businesses/keep them running smoothly. I'm independent so I can offer reasonably-priced, fast and flexible service. Another benefit of being independent is that I get to work for businesses that other people might not touch such as Bitcoin-related companies. It's more interesting than being a programmer because I get a higher-level view and work with a wide variety of people. I still keep one foot in the world of programming by building my own services on the side.
I really thought that this is what I wanted to do. I could solve interesting and new problems every day and be given a chance to explore what is effectively the infinite depths of computing at a very technical level.
I was really happy initially, but as time has gone on, I've become more and more aware that customer-facing IT has a lot less to do with solving technical problems and a lot more to do with solving social problems: talking to people, setting expectations, and bridging the gap between what should actually be done and what the customer thinks should be done... and I'm not good at that. I want to solve difficult, technical problems. I don't want my job description to be 'I primarily deal with people and technical competence is secondary'.
So I'm not there yet, but I feel this experience has helped really carve out what I do want for a career, so I'm glad I've had it. If there is one thing it helped define, it's the quality of the people I work with. I get to work with awesome people, and if I could find these kinds of people doing a more technically focused job, that might be the dream.
- I got into IT after being a warehouse worker and "being good with computers" there. I did air guitar to land my first IT job.
- My current job is awesome because, as a nonprofit, everyone is very huggy-feely, lots of psychiatrists leading the way and whatnot. It is also awesome because I am given free rein to come up with solutions, design new systems, and generally tinker with a home-built lab using old equipment I was loaned by this non-profit.
- We get several different models of charity licensing from various vendors in addition to having some of the best negotiators around. A non-profit running all EMC storage, backup and replication? Pretty awesome.
I love the work in general and would do it for peanuts, almost.
There's a solid, recent book on this very subject by Robert Kaplan. Here's his presentation from Talks @ Google, https://www.youtube.com/watch?v=8sY-qwEYjs0
Now I work for an ISP (I worked with one of the directors about 10 years ago) 10mins from home, working on everything that needs doing, and I would say that this is a dream job.
I guess it depends on your circumstances, and what you need at the time. Dream job is such a loose phrase. They have both had their downsides, but in happiness levels they have both excelled!
Edit: I should say that I worked at various places before, and none really compare, but they did give a good grounding and helped me figure out what it was I wanted from a job. (Hint: the money is nice, but after a while job satisfaction and quality of life take precedence.)
My job isn't awesome but I certainly don't hate it. I would much rather be working for a law enforcement agency. But I'm at a solid point in my current career, and I was going to work for the FBI but I would have to take a pay cut and it would take 2 years to get back to the pay I currently make. So add that in with a house and family and I just don't think I'm going to make a career move. So I just try and volunteer places to do criminal analysis and may pursue my PhD.
Since you are rebranding, you might be better off finding a name where the .com is available just to avoid the hassle of being shaken down if you get big later. Companies that just ponied up the money when they got big are dropbox (formerly getdropbox.com) and facebook (formerly thefacebook.com). It seems like recently you don't even need the .com (e.g. famo.us and socket.io), so you could go that route if your name is short and easy to remember.
The broker doesn't want to take your offer because he won't make much commission. He'd rather hold out for a bigger offer later on.
Circumvent the broker. Owner may be more likely to take your offer.
Don't get fixated on things. A business must evolve and adapt in order to succeed. If you are stubborn about a simple name, the web domain is the least of your worries.
1. Reformatting and content archival (lag times of hours to days are no prob).
As an example, I put together http://yareallyarchive.com to archive comments of a ridiculously prolific commenter on a site I follow. I needed the content of his comments, as well as the tree structure to shake out all the irrelevant comments leaving only the necessary context. Real time isn't an issue. Up until recently it ran on a weekly cron job. Now it's daily.
2. Aggregating and structuring data from disparate sources (real time can make you money).
I work in commercial real estate. Leasing websites are shitty, and the information companies are expensive and also kinda shitty. Where possible we scrape the websites for building availability, but a lot of the time that data is buried in PDFs. For a lot of business domains, being able to scrape data in a structured way from PDFs would be killer if you could do it! I guarantee the industries chollida1 mentioned want the hell out of this too. We enter the PDFs manually. :(
Updates go in monthly cycles, timeliness isn't a huge issue. Lag times of ~3-5 business days are just fine especially for the things that need to be manually entered.
This is exactly the sort of scraping that Priceonomics is doing. They charge $2k/site/month. Hopefully y'all are making that much.
3. Bespoke, one shot versions of #2.
One shot data imports, typically to initially populate a database. I've done a ton of these and I hate them. An example is a farmers' market project I worked on. We got our hands on a shitty national database of farmers markets, and I ended up writing a custom parser that worked in ~85% of cases; we manually cleaned up the rest. The thing that sucks about one shot scrape jobs from bad sources is that they almost always mean manual cleanup. It's just not worth it to write code that works 100% of the time when it will only be used once.
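That parse-what-you-can, queue-the-rest pattern can be sketched like this (the row layout and M/D/YYYY date format are hypothetical stand-ins, not the actual database's fields):

```python
import re

def parse_market_row(row):
    """Normalize one raw (name, city, date) row, or return None if it
    needs manual cleanup. Field layout here is a made-up example."""
    name, city, raw_date = (field.strip() for field in row)
    m = re.match(r"(\d{1,2})/(\d{1,2})/(\d{4})$", raw_date)
    if not name or not m:
        return None
    month, day, year = map(int, m.groups())
    return {"name": name, "city": city,
            "opened": f"{year:04d}-{month:02d}-{day:02d}"}

def import_rows(rows):
    """Split raw rows into cleanly parsed records and a manual-cleanup pile."""
    clean, needs_manual = [], []
    for row in rows:
        parsed = parse_market_row(row)
        if parsed:
            clean.append(parsed)
        else:
            needs_manual.append(row)
    return clean, needs_manual
```

The point is the split: you let the ~85% case flow through automatically and hand the `needs_manual` pile to a human, rather than chasing a 100% parser you'll use once.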
Make any part of structuring scraped data easier and you guys are awesome!
Anything that is released at a certain time on a fixed calendar, you can bet that multiple parties are trying to scrape it as fast as possible.
If you can scrape this data (the easy part), put it in a structured format (somewhat hard), and deliver it in under a few seconds (this is where you get paid), then you can almost name your price.
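For the scheduled-release case, the fetch side can be as simple as grabbing a baseline, sleeping until just before the known release time, then polling fast until the payload changes. A sketch, assuming you already have a `fetch()` that returns the page body:

```python
import time

def poll_until_changed(fetch, release_ts, pre_window=2.0,
                       interval=0.05, timeout=30.0):
    """Grab a pre-release baseline, sleep until `pre_window` seconds
    before `release_ts` (a Unix timestamp), then poll until the
    payload differs from the baseline or `timeout` expires."""
    baseline = fetch()
    wait = release_ts - time.time() - pre_window
    if wait > 0:
        time.sleep(wait)
    deadline = time.time() + timeout
    while time.time() < deadline:
        data = fetch()
        if data != baseline:
            return data
        time.sleep(interval)
    return None  # release never showed up within the timeout
```

The structuring and delivery stages then run on whatever `poll_until_changed` returns; the baseline comparison is what keeps you from re-parsing the stale pre-release page.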
It's an interesting niche that hasn't been computerized yet.
If you can't get the speed then the first 2 steps can still be useful to the large number of funds that are springing up using "deep learning" techniques to build a portfolio over timelines of weeks to months.
To answer the question of: > Wouldn't this require a huge network of various proxy IPs to constantly fetch new data from the site without being flagged and blacklisted?
This is why I gave the caveat of only looking at data that comes out at certain times. That way you only have to hit the server once, when the data comes out, or at least a few hundred times in the seconds leading up to the data's release :)
- Product pricing data: Many companies collect pricing data from e-commerce sites. Latency and temporal trends are important here. Believe it or not, there are still profitable companies out there that hire people to manually scrape websites and input data into a database.
- Various analyses based on job listing data: Similar to what you do by looking at which websites contain certain widgets, you can start understanding job listings (using NLP) to find out which technologies are used by which companies. Several startups are doing this. Great data for bizdev and sales. You can also use job data to understand technology hiring trends, understand the long-term strategies of competitors, or use them as a signal for the health of a company.
- News data + NLP: Crawling news data and understanding facts mentioned in news (using Natural Language Processing) in real-time is used in many industries. Finance, M&A, etc.
- People data: Crawl public LinkedIn and Twitter profiles to understand when people are switching jobs/careers, etc.
- Real-estate data: Understand pricing trends and merge information from similar listings found on various real estate listing websites.
- Merging signals and information from different sources: For example, crawl company websites, Crunchbase, news articles related to the company, and LinkedIn profiles of employees, and combine all the information found in the various sources to arrive at a meaningful structured representation. Not limited to companies; you can probably think of other use cases.
In general, I think there is a lot of untapped potential and useful data in combining the capabilities of large-scale web scraping, Natural Language Processing, and information fusion / entity resolution.
Getting changing data with low latency (and exposing it as a stream) is still very difficult, and there are lots of interesting use cases as well.
Hope this helps. Also, feel free to send me an email (in my profile) if you want to have a chat or exchange more ideas. Seems like we're working on similar things.
I have two main recurring scrapes:
- political donations. Every donation to a political party in my province above ~$300 is posted publicly on a gov't website (in a PDF). I use the data to run machine learning algorithms to predict who is most likely to want to donate to my party.
- public service expenses. My province has a "sunshine list" which publishes the salaries and contracts for all senior government officials. We grab it weekly (as once someone quits the gov't, their data disappears).
One tool that you could consider building is an easily accessible expense website, where people can enter the name of a public official and see all their expenses, including a summary of the total amount spent. There have been a number of massive expenses here in Canada related to this [1, 2].
[1] http://news.nationalpost.com/tag/alison-redford/
[2] http://en.wikipedia.org/wiki/Canadian_Senate_expenses_scanda...
No doubt the companies would justify this by saying e-mail isn't secure enough. The side-effect that it'll stop many users bothering to look at their bill isn't why they do it at all, no sir.
I've been considering making a web scraper that goes to the phone company, electricity company, gas company, broadband company, electronic payslips, bank, stockbroker, AWS and so on; logs in with my credentials; downloads the PDF (or html) statements; and sends them by e-mail.
Of course, such a web scraper would need my online banking credentials, so I'm not in the market for a software-as-a-service offering.
Managed to bag a lot of stuff over the last couple of years for not much money.
If someone bags this up as a service I'd pay for it.
I'm one of the team behind the crawl itself. Last month (July) we downloaded 4 billion web pages. Thanks to Amazon Public Datasets, all of that data is freely distributed via Amazon S3, under a very permissive license (i.e. good for academics, start-ups, businesses, and hobbyists). If your hardware lives on EC2, you can process the entire thing quickly for free. If you have your own cluster and many many terabytes of storage, you can download it too!
People have used the dataset to generate hyperlink graphs, web table content, microdata, n-gram and language model data (a la Google N-grams), NLP research on word vectors, and so on, so there's a lot that can be done!
http://commoncrawl.org/
http://webdatacommons.org/
http://statmt.org/ngrams
http://nlp.stanford.edu/projects/glove/
I've started working on a new website that will use data scraped from several vbulletin forums. I've found that even 2 vbulletin forums running the same version may have completely different html to work with. I'm assuming that it's the templates they are using that changes it so much.
I'm setting up the process so that the web scraping happens from different locations than the server where the site is hosted. The scraping scripts upload to the webserver via an API I've built for this. I mostly did this because for now I'm just using a free PythonAnywhere account and their firewall would block all of this without a paid account. And then also none of these sites would see the scraping traffic coming from my website, etc.
Been scraping a lot lately but mostly:
- government website for license holders
- creating lists of businesses for different segments (market research/analysis)
- using those lists to scrape individual sites and make analysis (how many use facebook/youtube/etc)
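For that last kind of analysis, the per-site check can be a minimal pattern match over the fetched HTML; a sketch (the platform patterns here are just illustrative examples, not exhaustive):

```python
import re

# Hypothetical signal patterns; extend with one entry per platform.
SOCIAL_PATTERNS = {
    "facebook": re.compile(r"facebook\.com/[\w.]+", re.I),
    "youtube": re.compile(r"youtube\.com/[\w/@.-]+", re.I),
}

def detect_social(html):
    """Return the set of platforms a page's HTML links to."""
    return {name for name, pattern in SOCIAL_PATTERNS.items()
            if pattern.search(html)}
```

Run that over each business's homepage from the scraped list and you get the "how many use facebook/youtube/etc" tally directly.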
Well-formed HTML is the exception rather than the rule and page navigation is often "interesting". Sometimes the school's system will use software from companies like Sungard or PeopleSoft, but there's customization within that... and of course, there's no incentive for the schools to aggregate this information in a common format (hence MyEdu's initiative), so there are plenty of homegrown systems. In short, there's no one-size-fits-all solution.
* NOTE: If you do attempt this, I insist that you teach throttling techniques from the very start. Some schools will IP block you if you hit them too hard; other schools have crummy infrastructure and will be crushed by your traffic. Scrape responsibly!
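In that spirit, a minimal throttling sketch (a deliberately simple single-host rate limiter, not production code):

```python
import time
import urllib.request

class PoliteFetcher:
    """Serialize requests so a host sees at most one every `delay` seconds."""

    def __init__(self, delay=2.0):
        self.delay = delay
        self._last = 0.0  # time.monotonic() of the previous request

    def _throttle(self):
        """Sleep just long enough to honor the per-request delay."""
        wait = self.delay - (time.monotonic() - self._last)
        if wait > 0:
            time.sleep(wait)
        self._last = time.monotonic()

    def fetch(self, url):
        self._throttle()
        return urllib.request.urlopen(url).read()
```

A real scraper would also back off on errors and respect robots.txt, but even this much keeps you from hammering a school's crummy infrastructure with a tight loop.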
I have a dream to use something closer to OCR against a rendered page, rather than parsing DOM. That way it would be less custom, and I could say, for instance, "find 'protein', the thing to the right of that is the protein grams".
I, personally, don't know how to do this, but I'd be willing to pay for a more generic way to scrape nutrition data (email in profile :) )
I'm looking to buy a house, and not all local estate agents post to Rightmove (or some post with a 24-hour delay). Trying to submit the search form on the agent's own hideous website, parse the results and get a standard structure between them all is hideous - I gave up in the end.
Once I have the data the challenge is then analysing it (geolocation, how long are the commute times, distance to amenities etc) which is its own separate challenge
I also did some e-commerce information scraping.
One of the most interesting ones was for a data-selling company. They asked me to collect data on geo information, disasters, finance, tweets, etc. We applied ML and statistics to generate forecasts from the historic data.
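As a hedged illustration of the "statistics on historic data" part (not their actual method, which isn't described): even a naive moving-average extension gives you a baseline forecast that fancier models have to beat.

```python
def moving_average_forecast(history, window=3, horizon=2):
    """Naive forecast: repeatedly extend the series with the mean of
    the last `window` points, `horizon` steps into the future."""
    series = list(history)
    forecasts = []
    for _ in range(horizon):
        avg = sum(series[-window:]) / window
        forecasts.append(avg)
        series.append(avg)  # feed the forecast back in for the next step
    return forecasts
```

In practice you'd compare a real model against a baseline like this on held-out data before trusting any forecast.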
People who put in a few hours of work to take advantage of other people's hard work piss me off. :/
Here is an example of the (un-finished) side-project: http://recappd.com/games/2014/02/07
I'm far from the only person scraping this data. Look at sites like http://vorped.com and http://nbawowy.com for even better examples.
Anyway... it turns out that flight APIs are ridiculously non-existent. I ended up scraping two different airline sites, but since it was against their terms, I never took the site any further.
One thing I am having a hard time with is scraping backlinks to websites. Currently I'm using Bing, but it's paid after something like 5,000 queries. I really wonder how companies like SEOmoz do this daily against millions of websites.
Pre-crawled copies with a distributed processing platform could be cool. You could come up with a better search engine with programmable rules that are edited collaboratively (like Wikipedia).
But Twitter is so vast you may want to categorize accounts.
But reddit is a good source for a lot of info.
Maybe instead of trying to change up the content, try to change up the method, i.e. do a talk on running crawlers/scrapers to seed your database at an interval (instead of just "scraping").
2. Gathering basic data that should be freely available anyway (like currency exchange rates, global weather, etc.). This is always done carefully and with a light touch, with maximum respect for the load imposed on the targeted systems. Again, haven't bothered in about five years.
3. Automating content acquisition. For search engines, media libraries, etc. This is more like ten years ago. These days there's so little call for it... maybe if I ran a boutique hotel chain in a copyright-isn't-respected jurisdiction and wanted to provide a fat library of in-room entertainment...
When I look back on my career the patterns that lead to good development were:
1) Small team sizes. I'm a big believer that if version 1 of your product is developed by more than 4 people, it's in trouble :) Small, super-focused, and highly talented teams make the best version 1 products in my opinion, which is probably related to why some startups can be more agile.
2) Everyone on the team has domain experience, i.e. it's not the first time such a product has been written. This is slightly at odds with "second-system syndrome". At my current company, when we wrote our algo platform, each member had done this before, so we knew from a data, networking, machine learning, and trader's perspective what we wanted to get done. There was very little flailing around trying to learn the domain (i.e. no learning what ML techniques to use, how to connect to exchanges via FIX, what a pairs trade was, etc.).
3) One, and only one, person in charge of the vision. This might be obvious, but debates, even when they are well intended, seem to slow things down. Having one person dictate what the next version will have seems to make things much easier. This is especially obvious in my current field of finance. It's very easy to spot the products developed by engineers for traders vs. the products developed by traders for traders. The former have lots of features that no one wants, but they look pretty; the latter look ugly but make money :)
META NOTE TO ANYONE DEVELOPING A TRADING SYSTEM: No one cares what it looks like. I'll say that again: no one cares what it looks like. The Bloomberg terminal is the ugliest thing on the planet and they mint money. Function over fashion, always. I'd go as far as to state that a designer on a small team developing a trading system is viewed in the same light as an MBA on a small team. That person might add value, but you'll need to justify why you're there instead of another engineer.
I think a lot of not shipping can be tied to these three things: too large a team; not knowing what the final product will be doing, and hence a lot of experimentation and wrong turns; and competing visions, or a lack of vision of what you are building.
Us: "Well, we just got the roof on and all the walls sheetrocked!"
Them: "We want all the walls moved now."
Lately it's also been the strangely recurring request to do the impossible. I've actually been asked to use cross site scripting ("like the hackers do - why can't you do it") to implement features a customer just had to have.
* Bad estimates. When people are too aggressive with their estimates and miss them, it derails everyone that was depending on that team's stuff being done at a particular time. It causes blockages that cost way more than an over-estimate would have reserved as buffer time.
* Getting distracted. Dev should work really closely with design and PDM to get things done, but if Design / PDM has already moved on to the next project, it's distracting for everyone when they have to get pulled back in. Then, you have multitasking, and blockages, and stuff doesn't get done as smoothly as it should.
Years ago, that was called Build Management, now, we call part of it DevOps. But the reality is that most teams have forgotten about professional engineering.
The consequences of pausing to think about the "how", rather than just hammering it out and living with the fallout, are enormous. The fact that I rarely see separation of concerns anymore, let alone a focus on reducing build times or test run times, is massive. Ultimately, that means tradeoffs - the software triangle comes into play.
It's especially interesting to me, since I'm (today) working with a team I last worked with 12 years ago. There's been massive flux - only three of twenty originals are here. And yet, with multiple geographies, multiple age groups, multiple operating systems, and three distinct cloud providers (let alone the internal wannabe cloud) they push out a valuable release weekly, and micro-push to production ~3x daily. That's a testament to the engineer who manages the team, the same engineer I worked with on build workflows over a decade ago.
Luckily, our deployments are automated and we deploy to our testing environment multiple times a week, sometimes multiple times per day. That at least enables us to get feedback from the test team, which results in tickets either being moved to 'really done' or moving them back to 'in progress'. It's not the same as shipping, but in such an environment it at least reminds you of the important fact that things are indeed still moving and getting done.
In reality, for the vast majority of launches very few people are watching, and even fewer care about your features or lack thereof (that's just not how early adopters think about products, in my opinion).
If founders realized how little traffic / how few downloads they were going to get out of the gate, they'd ship much earlier; unfortunately everybody thinks their project is going to be a TechCrunch headline, and that's just not the case.
We'll start a project and then half the devs will go on vacation. They return, and then the other half go on vacation. I guess that's summer for ya, but why not just close the office for a month and let everyone know that they should take their vacations around that time?
Devs will raise complaints about the work environment or a poorly written (but critical) module, but nothing gets done about it.
If a feature doesn't have strong executive support, nothing really gets done on it, and 3 months later someone will ask "hey, what happened to that feature?"
I'm going to single one out though: tweaking: making changes without adequate anticipation of the effects, or without the theoretical backing to expect that it will be correct. When you have a well structured high level understanding you can make changes knowing approximately what the effect will be and converge on the solution. Without that you end up thrashing around and making changes at random. If you take a random walk you're probably not going anywhere fast.
When you are dealing with a well designed and executed system, tweaking actually seems productive, because you begin from a good place in the solution space; when you explore the "local neighbourhood", the modifications still produce a functional piece of software, and it might actually be better in some ways that you care about.
When you are making something new, tweaking gets you nowhere.
EDIT: Here is a foolproof process to get me to automatically disregard all your future ideas:
1. Find some parameters that were chosen with theoretical justifications and a real analysis of historical data.
2. Modify one of the parameters based on some flimsy rationale.
3. Run some quick tests that are obviously designed specifically to confirm your expectations, declare victory and act like you've solved something.
Not having a clear and concise plan. For software development this obviously means a detailed, worked-out feature set.
It amazes me how often a development team is set out on the journey of building a system without clarity about what it's trying to accomplish. Sure, the high-level functionality is there, but as developers you need to know the low-level stuff. Usually this is due to a lack of leadership. A word to the "CEOs" out there:
A vision that's not formalised in a document and shared with the rest of the team is not a vision, it's fantasy.
Sure, I get it. For you, the CEO, it's easier to make stuff up as you go along than it is to write stuff down and to commit to it.
As a developer, when you try to get functionality formalized, the meetings to discuss it turn into "design" meetings where everyone has to come up with new "awesome" ideas, which means that at some point the team stops having meetings to discuss things.
It doesn't matter how much experience I have as a developer, I simply can't cram the full time role of project/product manager and at least 40 hours of writing quality software into a single week. Something has to give.
It also depends on what domain your company is in. If downtime doesn't lead to anyone's death, doesn't send you to prison or cost you millions in lawsuits from Enterprise customers... sure, ship daily.
Problems are damn hard: what we do is really hard, so basically we have to trim our projects to the basics instead of doing what we want to do, which is way more. Creative people always want to start way more projects than they could ever finish, so we need discipline here.
Complexity and lack of documentation: Another thing that needs a lot of discipline. People believe that what they know after thinking about it for a long time is evident to everybody else. It is not. Not only that, people forget what they know today if they do not document it; so if you do not document, a year from now you will be repeating most of your work, on the company's budget.
Tools: The longer it takes to release, the less often it will be done.
Culture: The harsher the response to failure, the less risk shall be taken.
The only unchanging thing is the deadline.
1. Fear of failure. Too many people are afraid to ship too soon. They fear a bad product out the gate, rather than getting out the gate in the first place.
2. Lack of focus. You'll spend too much time focusing on things that don't really matter, such as building the perfect messaging system rather than reusing something that exists and works now.
3. Busywork. You'll spend your time doing things that really don't matter, such as optimizing your icons to use fonts instead of a PNG, and then dealing with trying to make it work in older versions of IE.
"Unlimited", which used to be unlimited but has some kind of traffic control recently introduced.
No tethering, and this is strictly enforced. (there are separate deals for tethering and for dongles and tablets.)
3G only. They've only just got proper Apple iPhone support.
China, 3G, China Unicom, Monthly, 286, 0, 46, 0, 900, 1100, Yes
USA, 2G, T-Mobile, Daily, 2, 0, 2, 0, Unlimited, Unlimited, No
In the US almost all prices don't include the tax. Cellular services are particularly bad about this due to the way the US tax system works: things like 911 operations are funded from telephone and cellular bills.
So in the US a $20 list price might be closer to $35 in real terms (i.e. the amount they debit from your account every month).
Jamaica, on Digicel's network, HSPA+, pay as you go, tethering allowed, 2GB.
US,4G, $60/month, unlimited, no tethering, month-by-month contract
Thailand,4G, $15/month, 3GB/month, tethering, pay as you go
Germany,4G, $25/month, 3GB/month, no tethering, month-by-month contract
You have my upvote though! :)
Just build stuff.
Make use of virtualization and just start building systems. There is more than one "role" a sysadmin will play, and in some areas each specialty could be its own job.
E.g. a Windows/Active Directory admin vs. an Exchange admin.
Most good admins will know how to do a bunch of various duties: set up a Samba server, build a ZFS array, install a printer driver, configure a company wiki, and do all the maintenance needed to keep these systems running.
And learn how to automate 90% of your tasks.
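As one small example of the kind of task worth automating, here's a disk-usage check you could run from cron (the paths and threshold are placeholders for whatever your environment needs):

```python
import shutil

def disk_usage_report(paths, threshold=0.9):
    """Return (path, fraction_full) for filesystems above `threshold` full."""
    over = []
    for path in paths:
        usage = shutil.disk_usage(path)
        fraction = usage.used / usage.total
        if fraction >= threshold:
            over.append((path, round(fraction, 3)))
    return over

if __name__ == "__main__":
    # Placeholder mount points; in practice this would email or page someone.
    for path, frac in disk_usage_report(["/", "/var"], threshold=0.9):
        print(f"WARNING: {path} is {frac:.0%} full")
```

The point isn't this particular check; it's that every check you script once is a check you never have to remember to do by hand again.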
It's your choice if you want to be picky. If you want to only work on Linux systems, or, to be more anal, only on Debian servers, don't be surprised if it's harder to get a job. Spread yourself out and learn to be a jack-of-all-trades. I prefer working with Linux, but I jump up and resolve Windows headaches for my co-workers, because that annoying 10% of the job makes me more attractive to my boss than the neckbeard who is a distro snob and refuses to touch anything except OpenBSD.
Build systems to do one job. One DHCP server, one DNS server, one file server. Then start combining them and optimizing them. Break them and then fix them. There are a million "gotchas" that only show up when you start working with the systems and that you never see in YouTube videos.
As far as finding work; either 'exaggerate' about some of your experience while contracting, or start working help-desk and get your foot in the door.
Your resume will get you an interview, your real skills will get you the job.
Edit: oh, now it's "Verdana, Geneva, sans-serif" where it used to be "Verdana, sans-serif". So if you have Geneva but not Verdana, and Geneva was not already your sans-serif font, it will look different.
(0) Can you link to the product?
(1) When did you start and what has the growth trajectory been (e.g., weekly growth rate)?
(2) Is there some marginal cost other than server space that is limiting growth?
Makes sense, somewhat. Even in the case of minor fraud, they would likely spend more in manpower than they could possibly reclaim from such a low-earning outfit. You still have to file, of course, but it's properly simple if you didn't earn a significant amount.
EDIT: corrected from $200,000 to one million.
Still, even though I knew exactly how much I owed, when I saw $180k on the statement my first call was to my lawyer, for a hug.
Things like workers' comp, sales tax, board meeting and minutes requirements, insurance requirements, and the like. No need for lots of explanations, just a list of all the major things you need to do to run a C-Corp legally.
Does anyone know of such a list?
I think in the end we owed ~$3k. What a waste of everyone's time.
Also, for any companies doing business in New Jersey, there's a $500 minimum income tax even if you make no revenue - and even if you put the business on hold but still keep some assets on the corporation's books at any time that year - so be aware of that. It's somewhat ironic that for companies operating at a loss, states often refuse to waive any minimum tax - would they not wish the companies to be able to put that money towards their success, and to have a higher probability of generating both jobs and much higher income taxes in their states in future years? Though I suppose New Jersey is not the state best known for forward-thinking political practices...
Scratch that, that goes out to everyone. Get an accountant. Stop guessing at your taxes. You'll save more than you spend. That's their business.
"Every February we have frantic calls from our clients when they receive their tax bills."
LOL.. just founder things
Ended up being easier to pay up than dispute it (~$250).
You registered in Delaware without figuring out how much actual taxes you will have to pay. If you did, you would know right away that that invoice was bogus.
Not sure why this is on the front page.
edited for clarity
That said, I recommend you look into Stylus. It is better than either Less or Sass in my opinion, as it gets rid of a lot of unnecessary syntax.
I'd still be interested in hearing the arguments in favor of SASS.
I believe most people using SASS do so because of Rails.
When your attention is split between two separate but identical campaigns, both of them will fail. Choose one and make it count.
I might like it if the fundraisers were complementary and addressed to the same target groups, but actually different.
Also Kickstarter requires you meet your goal in order to get any money. It would be a shame if you got half the money on IndieGoGo and your Kickstarter failed because of it.
I have yet to use Crouton on my Chromebook because I have another laptop running Ubuntu and I just haven't found the need yet (though to be honest I haven't picked up my old laptop since I got the Chromebook).
Browserstack lets me check out what the web pages look like in other browsers.
If you do use Crouton, then you basically have an Ubuntu box. Most Chromebooks have an SD card slot and USB ports as well if you need extra storage.
Basically you can do anything you can do with Linux if you install Crouton, i.e. it installs a Debian-like distro in a chroot jail.
I used a Samsung Chromebook for about six weeks and then went out and bought a 13" MacBook Air (used a MacBook Pro before the Chromebook adventure).
The MBA has a bigger screen, better keyboard, way better touch pad. I still throw the CB into the bag when I travel. But most of the time the MBA is the preferred system.
With Crouton you have access to traditional Unix tools, so if most of what you need is command-line tools and a text editor, you can manage. It's not amazingly fast and the screen resolution is a bit limiting. But it's still a great machine: cheap, light, and powerful enough for web dev. I don't regret buying one.
ITYM a hyperbolic quote. A parabolic quote would describe an arc like a thrown ball, which eventually falls back to earth. A hyperbolic quote would escape toward infinity, which is the intended sense of the expression derived from the shapes of conic sections.
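To make the geometric point concrete: the thrown ball's parabolic height returns to zero at a finite range, while a hyperbola's branches run off to infinity along their asymptotes.

```latex
% Parabola: a projectile's height returns to zero at finite range.
y(x) = x\tan\theta - \frac{g x^2}{2 v^2 \cos^2\theta},
\qquad y = 0 \text{ again at } x = \frac{v^2 \sin 2\theta}{g}

% Hyperbola: the branches escape toward infinity along the asymptotes.
\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,
\qquad y \to \pm\frac{b}{a}\,x \text{ as } |x| \to \infty
```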
Letts' Law: All programs evolve until they can send email.
Zawinski's Law: Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.
Greenspun's tenth law: Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.
RTM provided a corollary which clarifies the set of "sufficiently complicated" programs to which the rule applies: "including Common Lisp." http://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule
Useful feature, that." -- Marcus J. Ranum, Digital Equipment Corp.