As far as why it doesn't have traction, I would attribute this to 2 things. The first being that the JS community isn't really into app development. What I mean by that, is when you look at JS, and you look at languages like Flex and Silverlight, you notice a massive disconnect in ideas. In the JS world, visual components are rarely if ever first class citizens, simple layout tasks are far more difficult than they need to be, and SPA paradigms are also second class citizens. After noticing this and poking around, you find out that the main JS community either doesn't want the solution to begin with, or they seem to be re-inventing the wheel.
The second strike against ExtJS is the cost. I got a license for it a few years ago when I was delusional and thought I could use it to replace Flex. Dropping $1,000 on something when virtually every other tech out there is free is hard to choke down.
Also, it seemed like a major step down for those of us coming from the Flex community. Like I said, with JS app paradigms seem like afterthoughts and each peculiarity wore me down further and further. For instance, if you have a button and you declare a listener function but misspell it, you don't get an error complaining that the listener doesn't exist, you get a blank white screen. Another issue was marshalling data into objects. Whenever I brought back a Classroom JSON representation, the Student objects in the students collection remained unmarshalled. I had to use ANOTHER store to get those to behave as objects. All of these "why on earth is this acceptable" moments led me to believe such features weren't really in demand in the JS community if not resented altogether.
I've used Sencha's frameworks on 3 different projects now, and I still hate them. For many different reasons:
1. The learning curve: This is probably something you already know. But what you may not be aware of is the other effects this can cause. If your team consists of members with different levels of experience (from 0+), the code quality is going to suffer a lot from the lack of knowledge.
2. ExtJS and Sencha Touch are 2 different products. Today's web app demands to be runnable on many different platforms. With ExtJS you get point and click. With Sencha Touch you get swipe and pinch. But what if you're running it on a touch laptop that needs both point and click and swipe and pinch? I'm working on a product right now that integrated the 2. While we succeeded at reaching the goal, the result is not something I'm proud of.
3. Sencha is a walled garden that makes it hard to integrate other libraries. For the most part, Sencha already has a lot of the things you need (like a data grid, combo box, etc.). But what if you want to do complex data visualization with d3? Complex interactive behavior with RxJS? Or realtime updates with socket.io?
Eventually, I found out that no matter which framework you choose to use, you'll end up needing to read its source code to understand what's underneath. If you have to do so anyway, why not pick a framework that's simple to understand?
It seems the ExtJS community has fallen off a bit as the company has (in my opinion) put it in the rear-view as they focus on Sencha and mobile. I also never really felt that I have seen what a true ExtJS app could be. The examples are the same as they were 5 years ago. Only the theme has been changed. You get a lot for free with the datagrid and charts but they aren't the only game in town for this.
No one wanted to build customer-facing ExtJS apps because of performance issues, because it was hard to style so that it didn't look like some dull enterprise thing, and because at the time Rails was just getting popular. I haven't touched ExtJS in a while, but building an admin CRUD in Rails would be much easier. The extra features of ExtJS can be gotten elsewhere and easily integrated.
Angular just seems like a breath of fresh air when comparing it to ExtJS, and again most of the components can be found elsewhere.
If I want to manipulate the DOM or do AJAX, I'll use jQuery; if I want to do data binding, I'll use knockout.js, etc.
Never ever use it for a client who has any ambitions on GUI design. Figuring out how to put an extra pixel here or there can take hours...
The licensing is pretty restrictive also and the prices are big.
I wouldn't say ext has no traction. I see sencha going pretty well and I've worked with extjs since the early days. It just looks more "business oriented"
The less you have need for that series A, the easier it will be to obtain.
Beyond that, spend the next few months focusing on your product and market, then if you really want that series A, meet with as many potential investors as you can, but not to pitch; you want a nice competitive round when the time comes. The worst series A failure is a completed one with bad terms.
It's small, so the overhead of fetching it is minimal. Once fetched it can be cached practically indefinitely and re-used by any sites that uses the same CDN, while still making live updates possible.
Bundling it would only provide a stale version, add bloat and open up a floodgate for other inclusions.
Lesson #1: Don't code when you're distracted.
Some hours later, the problem manifested. The queue workers came down, and AR (which is totally dependent on them for its core functionality) immediately stopped doing the thing customers pay me money to do. My monitoring system picked up on this and attempted to call me -- which would have worked great, except my cell phone was in a box that wasn't unpacked yet.
Lesson #2a: If you're running something mission critical, and your only way to recover from failure means you have to wake up when the phone rings, make sure that phone stays on and by you.
Later that evening I felt a feeling of vague unease about my change earlier and checked my email from my iPad. My inbox was full of furious customers who were observing, correctly, that I was 8 hours into an outage. Oh dear. I ssh'ed in from the iPad, reverted my last commit, and restarted the queue workers. Queues quickly went down to zero. Problem solved right?
Lesson #3: If at all possible, avoid having to resolve problems when exhausted/distracted. If you absolutely must do it, spend ten extra minutes to make sure you actually understand what went wrong, what your recovery plan is, and how that recovery plan will interact with what went wrong first.
AR didn't use idempotent queues (Lesson #4: Always use idempotent queues), so during the outage, every 5 minutes on a cron job every person who was supposed to be contacted that day got one reminder added to the queue. Fortuitously, AR didn't have all that many customers at the time, so only 15 or so people were affected. Less than fortuitously, those 15 folks had 10 to 100 messages queued, each. As soon as I pressed queues.restart() AR delivered all of those phone calls, text messages, and emails. At once.
Very few residential phone systems or cell phones respond in a customer-pleasing manner to 40 simultaneous telephone calls. It was a total DDOS on my customers' customers.
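A minimal sketch of what Lesson #4 could look like, assuming each reminder can be identified by recipient, channel, and day. The class and field names here are hypothetical, not AR's actual code. The point is that the key describes the intended effect rather than the enqueue attempt, so a cron job that repeats during an outage keeps producing the same key.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ReminderQueue:
    """In-memory queue that ignores a job whose key is already known."""

    jobs: list = field(default_factory=list)
    seen_keys: set = field(default_factory=set)

    def enqueue(self, recipient: str, channel: str, day: date) -> bool:
        # The key names the intended effect, not the enqueue attempt, so a
        # cron run that repeats every five minutes maps to the same key.
        key = (recipient, channel, day.isoformat())
        if key in self.seen_keys:
            return False  # already queued for this day
        self.seen_keys.add(key)
        self.jobs.append(key)
        return True


queue = ReminderQueue()
today = date(2011, 1, 1)  # arbitrary example date

# Simulate an 8-hour outage: cron fires every 5 minutes, 96 times in total.
for _ in range(96):
    queue.enqueue("+1-555-0100", "phone", today)

print(len(queue.jobs))  # 1
```

In a real system the set of seen keys would have to live somewhere durable (for example a unique index in the database), otherwise a restart forgets it.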
I got that news at 3 AM in the morning Japan time, at my new apartment, which didn't have Internet sufficient to run my laptop and development environment to see e.g. whose phones I had just blown up. Ogaki has neither Internet cafes nor taxis available at 3 AM in the morning. As a result, I had to put my laptop in a bag and walk across town, in the freezing rain, to get back to my old apartment, which still had a working Internet connection.
By the time I had completed the walk of shame I was drenched, miserable, and had magnified the likely impact that this had on customers' customers in my own mind. Then I got to my old apartment and checked email. The first one was, as you might expect, rather irate. And I just lost it. Broke down in tears. Cried for a good ten minutes. Called my father to explain what had happened, because I knew that I had to start making apology calls and wasn't sure prior to talking to him that I'd be able to do it without my voice breaking.
The end result? Lost two customers, regained one because he was impressed by my apology. The end users were mostly satisfied with my apologies. (It took me about two hours on the phone, as many of them had turned off their phones when they blew up.)
You'd need a magnifying glass to detect it ever happened, looking on any chart of interest to me. The software got modestly better after I spent a solid two weeks on improved fault tolerance and monitoring.
Lesson the last: It's just a job/business. The bad days are usually a lot less important in hindsight than they seem in the moment.
She found out about it pretty quickly due to having syslog be a constant presence in one of her gnu screen windows and gave me a look. She quickly reverted what I did, updated our config management tool, tested it, then deployed it, while explaining why this was the right way to do things. I slowly came around to doing things the right way and haven't thought much about the initial incident until we found her personal logs that she archived and left on our public network share for future reference.
In the entries for the day that I started, we saw the following two lines:
[*] 2007/09/09 09:58 - yan started. gave sudo privs and initial hire forms.
[*] 2007/09/09 10:45 - revoked yan's sudo privs.
I was a little sleepy one morning and accidentally connected to prod instead of testing. I thought, "That's weird, this UPDATE shouldn't have taken so long-oh shit." I'd managed to clear all allergy and malignant hyperthermia fields. For all I knew, some anesthesiologist would kill a patient because of my mistake. I was shaking. I immediately found the technical lead, pulled him from a meeting, and told him what happened. He'd been smart enough to set up hourly DB snapshots and query logs. It only took five minutes to restore from a snapshot and replay all the logs, not including my UPDATE.
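One cheap guard against this kind of runaway UPDATE is to check how many rows a statement touched before committing. The sketch below uses Python's built-in sqlite3 module with a made-up `patients` table. The table, columns, and the `guarded_update` helper are all assumptions for illustration, not the system described above.

```python
import sqlite3


def guarded_update(conn, sql, params, max_rows):
    """Run an UPDATE, but roll back if it touches more rows than expected."""
    cur = conn.execute(sql, params)
    if cur.rowcount > max_rows:
        conn.rollback()
        raise RuntimeError(
            f"refusing to commit: {cur.rowcount} rows changed, "
            f"expected at most {max_rows}"
        )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, allergy TEXT)")
conn.executemany(
    "INSERT INTO patients (allergy) VALUES (?)",
    [("penicillin",), ("latex",), ("none",)],
)
conn.commit()

# Intended to clear one patient's field, but the WHERE clause is missing.
try:
    guarded_update(conn, "UPDATE patients SET allergy = NULL", (), max_rows=1)
except RuntimeError as err:
    print(err)

remaining = conn.execute(
    "SELECT COUNT(*) FROM patients WHERE allergy IS NOT NULL"
).fetchone()[0]
print(remaining)  # 3
```

This doesn't replace snapshots and query logs, which are what actually saved the day here. It only narrows the window in which a sleepy mistake becomes permanent.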
Afterwards, my access to prod was not revoked. We both agreed I'd learned a valuable lesson, and that I was unlikely to repeat that mistake. The tech lead explained the incident to the higher-ups, who decided to avoid mentioning anything to the affected hospitals.
If it's any consolation, the company is no longer in business.
Just remember when you screw things up: Your mistake probably won't get anyone killed, so don't panic too much.
When I worked at Subway, the bread dough came frozen, but you would put loaves in a proofer, proof it for a certain amount of time, and then bake it. My first shift, however, got busy and I left several trays in the proofer for a very, very long time. Consequently, they rose to roughly the size of loaves of bread, as opposed to the usual buns.
It was my very first shift alone at any job in my life, so I did the most logical thing I could think of and put the massive buns in the oven. They cooked up nicely enough and I thought I was saved. Until I tried to cut into one.
Back in that day, Subway used to cut those silly u-shaped gouges out of their buns. In retrospect, I think this was most likely a bizarre HR technique designed to weed out the real dummies, but at the time I was oblivious (likely because I was one of the dummies they should have weeded out). When I ran out of the normal bread, I grabbed one of my monstrosities, tried to cut into it, and discovered that it was not only rock hard, but the loaf broke apart as I tried to cut it.
That night, my severe shyness and social awkwardness had their first run-in with beasts known as angry customers. I was scared I would get fired, so I promptly made new buns, but spent the rest of my shift trying to get rid of my blunder. I discovered some really interesting things about people that night. First, you'd be surprised how incredibly nice customers are if you are straight up with them. Some customers I had never met before treated the big, crumbly buns as an adventure and, in doing so, helped me sell all the ruined buns.
In the end, I came clean (and didn't get fired). That horrible night was a huge event in the dismantling of my shell. It taught me an awful lot about ethics. And frankly, that brief experience in food service forever changed how I deal with staff in similar types of jobs.
Surprisingly it all seemed to work well. These disaster recovery steps weren't heavily tested before. Brilliant! I went to shut down the AWS instances. Kill DB group. Wait. Wait... The DB group? Wasn't it DB-test group...
I'd just killed all the production databases. And the streaming replicas. And... everything... All at the busiest time of day for our site.
Panic arose in my chest. Eyes glazed over. It's one thing to test disaster recovery when it doesn't matter, but when it suddenly does matter... I turned to the disaster recovery code I'd just been testing. I was reasonably sure it all worked... Reasonably...
Less than five minutes later, I'd spun up a brand new database cluster. The only loss was a minute or two of user transactions, which for our site wasn't too problematic.
My friends joked later that at least we now knew for sure that disaster recovery worked in production...
Lesson: When testing disaster recovery, ensure you're not actually creating a disaster in production.
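One way to make that lesson mechanical is a guard in front of the destructive call. This is a sketch only: the "-test" suffix convention, the `kill_group` wrapper, and the `terminate` callable are assumptions for illustration and not part of any AWS API.

```python
class RefusedError(Exception):
    """Raised when a destructive call targets a group that is not disposable."""


# Assumption: every disposable group carries this suffix.
DISPOSABLE_SUFFIX = "-test"


def kill_group(name, terminate, confirm=None):
    """Terminate a group only if it is disposable or explicitly confirmed.

    `terminate` is a caller-supplied callable that does the real work.
    `confirm` must repeat the exact group name to allow a non-test target.
    """
    if not name.endswith(DISPOSABLE_SUFFIX) and confirm != name:
        raise RefusedError(
            f"{name!r} is not a test group; pass confirm={name!r} to proceed"
        )
    terminate(name)


killed = []
kill_group("DB-test", killed.append)
try:
    kill_group("DB", killed.append)  # the slip from the story
except RefusedError as err:
    print(err)

print(killed)  # ['DB-test']
```

Having to type the production name a second time is a small speed bump, but it turns "wait... the DB group?" into an error message instead of an outage.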
I had had some test tables sitting around in the database for a while and decided to clean them up. I stupidly forgot to check the status of my backups; because of an earlier error, they were not being correctly saved.
So, I had a bunch of tables with similar names:
users_1024
users_1025
users_1026
Guess what got deleted along with them? The actual users table (which I've since renamed to something that does not even contain "users" in it).
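The comment doesn't say how the cleanup selected its tables, but a prefix match on "users" would produce exactly this result. Assuming that was the cause, a sketch of a stricter selection that requires the whole name to match the numbered pattern (the table list below is made up):

```python
import re

# Matches only numbered scratch tables such as users_1024, never "users".
TEST_TABLE = re.compile(r"users_\d+")


def tables_to_drop(all_tables):
    """Return the tables whose full name matches the scratch-table pattern."""
    return [t for t in all_tables if TEST_TABLE.fullmatch(t)]


tables = ["users", "users_1024", "users_1025", "users_1026", "users_archive"]

# A loose prefix match sweeps up the real table too.
print([t for t in tables if t.startswith("users")])
# ['users', 'users_1024', 'users_1025', 'users_1026', 'users_archive']

print(tables_to_drop(tables))
# ['users_1024', 'users_1025', 'users_1026']
```

Printing the list and reading it before running any DROP is the other half of the fix.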
So, how do you recover a users table when you've just deleted it and your backup has failed?
Well, I happened to have all of my users' email addresses stored in a separate mailing list table, but that table did not store their associated user IDs.
So I sent them all an email, prompting them to visit a password reset page.
When they visited the page, if their user ID was stored in a cookie -- and for most of them, it was -- I was able to re-associate their user ID with their email address, prompt them to select a new password, and essentially restore their account activity.
There was a small subset of users who did not have their user IDs stored in a cookie, though.
Here's how I tackled that problem:
Because the bulk of a user's activity on the site involves answering poll questions, I prompted them to select some poll questions that they had answered previously, and that they were certain they could answer again in the same way. I was then able to compare their answers to the list of previous responses and narrow down the possibilities. Once I had narrowed it down to a single user, I prompted them to answer a few more "challenge" questions from that user's history, to make sure that the match was correct. (Of course, that type of strategy would not work for a website where you have to be 100% sure, rather than, say, 98% sure, that you've matched the correct person to the account.)
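The matching step described above can be sketched roughly like this. The data layout (a dict of past answers per user ID) and the function names are assumptions for illustration. The real site's schema isn't shown.

```python
def narrow_candidates(history, claimed):
    """Return user IDs whose stored answers agree with every claimed answer.

    history: {user_id: {question_id: answer}} from the surviving responses
    claimed: {question_id: answer} as re-entered by the visitor
    """
    return [
        user_id
        for user_id, answers in history.items()
        if all(answers.get(q) == a for q, a in claimed.items())
    ]


def challenge_questions(history, user_id, claimed, count=3):
    """Pick further questions from the candidate's history not yet used."""
    unused = [q for q in history[user_id] if q not in claimed]
    return unused[:count]


history = {
    101: {"q1": "yes", "q2": "no", "q3": "yes", "q4": "no"},
    102: {"q1": "yes", "q2": "yes", "q3": "no"},
    103: {"q1": "no", "q2": "no", "q3": "yes"},
}
claimed = {"q1": "yes", "q2": "no"}

matches = narrow_candidates(history, claimed)
print(matches)  # [101]
print(challenge_questions(history, matches[0], claimed))  # ['q3', 'q4']
```

If more than one candidate remains, the visitor answers more questions. Once exactly one is left, the challenge questions confirm the match.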
Nobody was killed, but we had a few injured. Thankfully the brunt of it hit the MRAP in front of us. If it hit my vehicle (HMMWV, flat bottom) instead I probably wouldn't be here.
That was the first major operation on my first deployment, too. Hello, world!
My takeaway? Shit just got real.
We ended up stranded that night after the 3rd IED strike (our "rescuers" said it was too dangerous to get us). It was the scariest day of my life, but in similar future situations it was different. I still felt fear and the reality of the existential threat, but I accepted it. It was almost liberating. Strange.
I deployed for another year after that (to Afghanistan that time). After Afghanistan I left the Corps and started my company. Because if it fails, what's the worst that can happen? Lulz.
After a couple hours of swearing, instead of working from a root shell in my own account, I just logged into the GUI as root. And there was a pretty interface showing the disks. I could just click on one and format it. Hooray!
Well either the GUI was buggy or I clicked on the wrong disk, because as the format was going, I realized the external drive wasn't doing anything. I was formatting the internal boot hard drive. And since nobody but me gave a crap about this weird free box somebody had given them, they had repurposed it. As a file server. For the home directories of a bunch of my colleagues. Who were now collecting around me wondering what was going on. Oops.
No problem, says I. I'll just restore from backups. But this thing used a weird magneto-optical drive. The only boot media we had was on an MO disk. The backups were on another. And there was only one of these drives, probably only one in the whole state. The drives were, of course, incredibly slow, especially if you needed to swap disks. Which, I eventually discovered, I would have to do about a million times to have a hope of recovery.
Long story short, I spent 28 hours in a row in that chair. It was my immersion baptism in the ways of being a sysadmin. The things I learned:
Fear the root shell. It should be treated with as much caution as a live snake.
Have backups. People will do dumb things; be ready.
A backup plan where you have never tried restoring anything may lead to more excitement than you want.
Be suspicious of GUI admin tools. Avoid new GUI admin tools if at all possible. Let somebody else be the one to discover the dangerous flaws.
If you were smart enough to break something, you're smart enough to fix it. Don't give up.
When some young idiot fucks up, check to make sure that they are sufficiently freaked out. If they are, no need to yell at them. Instead support them in solving the problem.
Seriously, my colleagues were awesome about this. I went on to become an actual paid sysadmin, and spent many years enjoying the work. The experience taught me fear, and a level of care that sticks with me today. I'm sure at the time I was wishing somebody would wave a magic wand and make the problems go away, but working through it gave me a level of comfort in apparent disasters that has been helpful many times since.
http://en.wikipedia.org/wiki/NeXTcube
http://en.wikipedia.org/wiki/Magneto-optical_drive
http://en.wikipedia.org/wiki/Immersion_baptism
After the test was complete, I forgot to turn off the AdWords campaigns. (Such a silly mistake...) Nobody noticed until our bill arrived from Google, and it was substantially higher than normal. When my coworker came to ask me about it ("are these your campaigns?!?"), I just sank in my chair.
I think it cost the company $30k. I suppose it's not that much money in the grand scheme of things, but I felt very bad.
~ 2007, working in a large bioinformatics group with our own very powerful cluster, mainly used for protein folding. Example job: fold every protein from a predicted coding region in a given genome. I was mostly doing graph analysis on metabolic and genetic networks though, and writing everything in Perl.
I had a research deadline coming up in a month, but I was also about to go on a hunting trip and be incommunicado for two weeks. I had to kick off a large job (about 75,000 total tasks) but I figured spread over our 8,000 node cluster it would be okay (GPFS storage, set up for us by IBM). I kicked off the jobs as I walked out the door for the woods.
Except I had been doing all my testing of those jobs locally, and my Perl environment was configured slightly differently on the cluster, so while I was running through billions of iterations on each node I was writing the same warning to STDOUT, over and over. It filled up the disks everywhere and caused an epic I/O traffic jam that crashed every single long-running protein folding job. The disk space issues caused some interesting edge cases and it was basically a few days before the cluster would function properly and not lose data or crash jobs. The best part was that I was totally unreachable and thus no one could vent their ire, causing me to return happy and well-rested to an overworked office brimming with fermented ill-will. And I didn't get my own calculations done either, causing me to miss a deadline.
1) PRODUCTION != DEVELOPMENT ever ever ever ever
2) Big jobs should be preceded by small but qualitatively identical test jobs
3) Don't launch any multi-day builds on a Friday
4) Know what your resource consumption will mean for your colleagues in the best and worst cases
5) Make sure any bad code you've written has been aired out before you go on vacation
6) Don't use Perl when what you really needed was Hadoop
Now, if you go to a CNET site and view source, there's a <!-- Chewie loves you --> comment. I like to think of that as an homage to my original fuckup.
A relatively minor bug in the software that I wrote caused the safety curtain to stop triggering when a certain condition was met. We discovered this bug after an operator was injured by one of these machines. Her hand needed something like 14 stitches.
1. Event-driven code is hard.
2. There's no difference between a 'relatively minor' bug and a major one. The damage is still the same.
The billing specs kept changing, as did the specs for the show itself. New price points, more plans, change the show interface, add another option here, etc. The plan had been to do a free preview show the day before to work out the kinks. That didn't happen.
The time leading up to show start was pretty tense, lots of updates, even a few last minute changes! Then the show actually started, brief relief. The chat system built in started deleting messages, one of those last minute feature changes had screwed up automatic old-message deletion. We had a fix though, update the JS, and bounce everyone out of the show and back in so the JS updates. Fixed!
Then the CEO pointed out that the quality just kept getting worse. Turns out that while the video player had both a numeric value and a string description for the different quality levels, it assumed they were in ascending order. So once it confirmed it could stream well at a given level, it automatically tried the next, which worked! Poor quality for everyone. Fixed, and another bounce.
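A sketch of one way to avoid that assumption: pick the next level by comparing the numeric values instead of relying on list position. The `QualityLevel` type, the sample values, and the best-first ordering are assumptions for illustration. The actual player's data isn't shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityLevel:
    value: int  # numeric value; higher is assumed to mean better quality
    label: str  # string description


def next_level_up(levels, current):
    """Return the lowest level strictly better than `current`, or None at the top."""
    better = [lvl for lvl in levels if lvl.value > current.value]
    return min(better, key=lambda lvl: lvl.value, default=None)


# Listed best-first, which would defeat a "try index + 1" rule.
levels = [
    QualityLevel(3, "high"),
    QualityLevel(2, "medium"),
    QualityLevel(1, "low"),
]

current = levels[1]  # streaming fine at "medium"
print(levels[2].label)  # low  <- what "next in the list" would pick
print(next_level_up(levels, current).label)  # high
print(next_level_up(levels, levels[0]))  # None
```

Sorting the list once at load time would work just as well. Either way the code stops trusting the order the data arrives in.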
Then it was over, time to go home. Back in the next day to finish off the billing code. I decided to approach it like a time card system. Traverse the logs in order, recording punch in time, when someone punches out, look up their punch-in times and set that user's time spent to the difference. Remove punch-in and out from the current record so they're not used again.
Now two facts from above added up to a pretty serious bug.
1) I _set_ the time spent to the difference between the two times. Not added, set.
2) We bounced everyone from the show twice to update their JS and video player. So everyone had multiple join/parts.
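For illustration, here is a small sketch of the time card approach with the one-character fix: accumulate each session instead of overwriting. The event format (timestamp in seconds, user, action) is an assumption. The real log format isn't given.

```python
from collections import defaultdict


def time_spent(events):
    """Total seconds per user from an ordered log of (timestamp, user, action)."""
    punched_in = {}
    totals = defaultdict(int)
    for timestamp, user, action in events:
        if action == "join":
            punched_in[user] = timestamp
        elif action == "part" and user in punched_in:
            # Add, not set: a user bounced out and back has several sessions.
            # pop() also removes the punch-in so it is not used again.
            totals[user] += timestamp - punched_in.pop(user)
    return dict(totals)


events = [
    (0, "alice", "join"),
    (600, "alice", "part"),   # first forced bounce
    (610, "alice", "join"),
    (1800, "alice", "part"),  # second forced bounce
    (1815, "alice", "join"),
    (3600, "alice", "part"),  # show ends
]

print(time_spent(events))  # {'alice': 3575}
```

With `=` in place of `+=`, the same log bills alice for only the last session (1785 seconds), which is the under-billing described above.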
I under-billed customers by tens of thousands of dollars.
Things I learned:
- Don't just argue that you need a trial run, make sure management understands the benefits. Why, not What.
- Duplicate billing code. After that a co-worker and I wrote two separate billing parsers for things, 1 designed to be different, not efficient.
- Give yourself ways to fix problems after they crop up. The bounce killed my billing code, but not doing it would have damaged the actual product (which later became a regular feature). Wish that thing had been my idea.
I was commissioning a new control system at a power plant's water treatment facility. I was fairly new to the industry, and my on-the-job training had mostly consisted of looking over the shoulder of the guy who did the bulk of the work.
This particular day the guy was out sick and we had to finalize a couple of things before we ran through the final tests.
There was an instruction to open a valve to fill a tank, and it had the wrong variable linked to it. The problem was that, to maintain the naming standards, I had to do a download to the processor to make the change. When I had been doing work in the office this was not a big deal: download the program to the processor, it stops running for a moment while it loads the new logic into memory, and starts back up.
Not thinking through the implications of the processor shutting down while the process was up and running I made the code changes, hit download and about 30 seconds later an operator came running over looking like he had seen a ghost and he was pissed.
While I was making my code changes, the operator was hooking up a hose to drain a rail car of some chemicals. The way the valves were configured before I made my changes was correct and would have had no consequence if I didn't touch anything. The way the valves were configured when the processor restarted would have routed the rail car's contents to the wrong tank, resulting in a reaction which would have created a huge plume of highly toxic gas. The way the wind was blowing, this plume would have blown directly to the largest town in the area and could have killed a ton of people.
The operator heard the valves in question changing position before he opened the valve on his hose to empty the rail car and figured something was up. When he saw the whole process had shut down he got really angry because I had ignored the protocol in place to avoid such a disaster.
I got chewed out and kicked off the site. My boss attributed my mistake to inexperience and I had to give a safety presentation on what I did wrong.
Lessons learned:
Be sure you are aware of any implications your actions have. If you are unsure or guessing about something, stop what you are doing and go ask someone first.
Don't give people mission critical work on their first project and have them work unsupervised. Training is important.
Always be aware of safety requirements, especially when you are working with machinery, automated processes, chemicals or anything else that can hurt, maim or kill you.
My story (though I wasn't directly responsible): we were delivering our software to an obscure government agency. Based on our recommendation, they had ordered a couple of SGI boxes. I wrote the installation script, which copied stuff off the CD, etc. Being a tcsh aficionado, I decided to write it in tcsh with the shebang line
In the script, the first few lines were something like:
set HOME = "/some/location"
/bin/rm -rf $HOME/*
It wasn't but an hour before I lost sysadmin privileges.
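Whatever exactly went wrong with that variable, a recursive delete built from a variable is safer when it checks the resolved path first. A rough sketch of that idea follows, in Python rather than tcsh. The `clear_install_dir` helper and the directory names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def clear_install_dir(target, allowed_root):
    """Delete the contents of `target` only if it lies strictly inside `allowed_root`."""
    target = Path(target).resolve()
    allowed_root = Path(allowed_root).resolve()
    # `parents` never includes the path itself, so the root is refused too.
    if allowed_root not in target.parents:
        raise ValueError(f"refusing to clear {target}: not inside {allowed_root}")
    for entry in target.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


with tempfile.TemporaryDirectory() as root:
    install = Path(root) / "product"
    (install / "old").mkdir(parents=True)
    (install / "old" / "file.txt").write_text("stale")

    clear_install_dir(install, root)
    print(list(install.iterdir()))  # []

    try:
        clear_install_dir("/", root)  # what a wrong variable might expand to
    except ValueError:
        print("refused")
```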
Never "experiment" with a production system - ever.
I inherited a mess of an architecture and am finally getting around to rewriting our deployment process. We buy VM services from a local outfit and the prices are basically an arm and a leg for rather small machines. Due to this my predecessor put in place an insane deployment script. It pulls the new version from github then reloads code on the running dynos, one after another. Reverting is out of the question with our current approach to VCS (something I am also fixing). Most of the time this is no problem, all we are changing really is some template code, or introducing new models and their views.
Thinking back I am quite happy we don't run into more problems than we do, but also happy that this type of insanity is soon in the rearview mirror.
The worst mistake was recently, cost us about 4 hours of downtime during the busiest time of the day.
A big feature on all news sites is lists of stories to present to the user to look at after they have read what you put in front of them at the moment. They may take the form of most viral, most read, most commented, sliced by time or category or many other factors. My predecessor had written all those lists statically, which made maintenance a nightmare and extension very fragile.
I made a function that was a generic list of items. You supply basic parameters, amongst them a QuerySet for what would construct the list and my function would check to see if it was cached and if it wasn't, generate it and cache it.
The framework I use (Django) generally uses lazy evaluation for all QuerySets and I rarely have to think about the size of the list I generate, I just take care to limit the query before I list() it. During development nothing showed up as a problem and I deployed this and all seemed to be good with the world.
A week passes by where I made at least 2 minor deploys (small changes to templates, minor tweaks to list filters) and all seemed to be good with the world.
Designer sends me a pull request, I look over the code, just some garden-variety template changes, nothing that should raise an eyebrow. Make the merge, plan to deploy and then go to lunch. Deployment done, all seems well for 2 minutes but then suddenly servers lit on fire. Pages spewed 404's and 500's like there was no tomorrow.
For 4 hours I tear my hair out, examine every piece of code I was deploying that day, call in the big gun support (the kind that costs more money than I care to think about). Everything I was looking at pointed to the caching agent not working. Too many pageviews requesting the database, too much load on the servers, reboots made them work fine for about a minute but then everything became bogged down.
The big gun support pointed something out finally that I had missed: Traffic from the database to the dynos was abnormally high. Made me take a look at code that had been there for a while and lo and behold: For some reason when you pass a QuerySet as a parameter, it seems to be evaluated for the receiving function! 2 lines of code added, one deploy, problem fixed.
I have no idea to this day how this code could be live for a week without causing problems but an unrelated change triggers the bad behavior. This is not the first time I've seen strange behavior from code, having seen a Heisenbug in Java code.
There's a happy ending to this. I made a big mea culpa slideshow where I pointed out all the flaws and what we needed to do to prevent a recurrence. I got support to make the changes needed and my new cluster goes live day after tomorrow. Now I can carefully change NEW dynos for a deployment, keeping the old ones around if the shit hits the fan. I got some changes instituted in how we approach VC, something that's hampered work for a while. And we save money in the long run because we will no longer be paying an arm and a leg for the VMs (AND I got to learn about clustering machines with HA, good stuff with gravy).
We were storing payment details sent from a PHP system into a Ruby system, I was responsible for the sending and receiving endpoints. Everything was heavily tested on the Ruby end but the PHP end was a legacy system with no testing framework. Since the details were encrypted on the Ruby end, I didn't do a full test from end to end AND unencrypt the stored results.
Turns out for two months we were storing the string '[Array]' as people's payment details.
Takeaway: If you're doing an end to end test, make sure you go all the way to the end.
i was the one developing the macromedia director app running on the cd.
we were on-time.
we were ready to send them out the door.
it was awesome.
and then we tested the rom outside of our network...
in some far-off corner of code, i had baked in a hard reference to one of our file servers on our network for some streaming assets. the cd failed as soon as you put it in the drive due to that reference to the missing file.
by the time we discovered this, we'd already glass-mastered and stamped 30,000 discs to the tune of $40k or so. or, about $6k per employee. in a company that booked about $50k the previous year. where i worked for free for 9 months.
so, my line of code cost our little company the equivalent of almost all of our previous year's revenue -- not profit, but revenue.
we, of course, had to make the run again -- only this time at the emergency rush prices. and this time, we were running late.
we managed to book some time in the middle of the night at the stamping plant. it was 4am. i had a courier standing over my shoulder watching me run the final build again, this time without the dreaded line of code -- which broke other things i had to fix when i removed it -- before he could take it.
i finished testing. ejected the disc. handed it to the courier, who started running as he was placing it into its case. he drove like hell to make it to the airport where we counter-to-countered it on a 2-hour, 6am flight to vegas for stamping.
oh, and it almost got even worse from there. almost.
we didn't know if they would be able to stuff the cds into the packaging because this was an emergency run and they didn't have the people available.
we were actually on our way to rent a uhaul which we calculated we could drive to vegas just in time for the stamping run to finish. from there, we would load the discs on their spindles, and 4 of us were going to sit in the back of the van, stuffing 30,000 discs while we drove the uhaul to palo alto. from vegas. yes, stuffing discs in the back of a traveling uhaul.
we even had the patio furniture from one of the employees yards already picked out to sit in while we were in the back of the truck.
luckily, the plant managed to squeeze in our packaging (at rush pricing, of course) and all we needed to do was have one of our guys take them as luggage on a later flight that day to the bay area instead.
as to a couple, big lessons learned?
1) i can honestly tell you, i've never, ever had a hard-coded, local network link in anything i've shipped since and never will again. always test off-network. especially these days with mobile apps and their on-off-network states.
2) a strong, non-finger-pointing team is where you need to be. i felt appropriately awful, but we handled it as a team and proceeded to grow that little company to about $40 million a year before a merger.
p.s. oh, and next time, remind me to tell you about the time i ran a database query on production that nuked the entire website for the publicly-traded software company which relied on -- wait for it -- the website to do all its commerce.
yup. that really happened.
It was 4-5am and I'd been working all night. I was on the server trying to set something up and was trying to blow away a folder ... I did a normal rm and that didn't work (obviously) because there was crap in the folder. So I pulled out my nuclear weapon to nuke the folder but left off the preceding ./ (which still wasn't that smart anyway) ... I sat there for a second wondering why the deletion was taking so long ... then another 30 then a minute ... then I looked at what I'd just typed again ... then I realized what had happened.
ctrl-c'ed (or d, can't remember now) out of it. then tried to find root folders
cd /etc => folder not found
cd /var => folder not found
I'm from a third world country where we laugh at Americans (sorry) for throwing up when they're nervous or having panic attacks, but at that moment, I had a full blown panic attack. I'll never forget it.
The work was a subcontract for a client who was doing work for Nike, and it was a decently sized project that was critical to the success of the firm, and I'd just blown away their live production server ...
After freaking out and almost crying for 5 minutes, I decided to call Media Temple support (we were using one of their VPS servers) ... and by the biggest absolute stroke of luck they'd just backed up the entire server ... not even 2 hours prior to my madness. $100 for a full restore (I don't recall why) and would I like to do that?
HECK YES I WOULD!
so they restored the server for me. I wrote an email to the head of the small company I was doing all the work for, explaining what had happened and telling him I'd sent over a check for $100 to cover the backup because it was my fault. He was obviously very relieved and never cashed the check I sent.
I still get chills thinking about that exact moment when I thought I'd fucked up my career and reputation for good.
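For what it's worth, here is a rough Python sketch of the kind of guard that would have refused that command. `safe_rmtree` and the idea of an "allowed base" directory are my own assumptions, not something from the story; the only library calls used are `pathlib.Path.resolve` and `shutil.rmtree`.

```python
import shutil
import tempfile
from pathlib import Path

def safe_rmtree(target, allowed_base):
    """Delete target only if it resolves to a path strictly inside allowed_base."""
    base = Path(allowed_base).resolve()
    path = Path(target).resolve()
    # Refuse the base itself and anything outside it (for example "/").
    if path == base or base not in path.parents:
        raise ValueError(f"refusing to delete {path}: not inside {base}")
    shutil.rmtree(path)
    return path

# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as base:
    scratch = Path(base) / "scratch"
    scratch.mkdir()
    safe_rmtree(scratch, base)      # fine: inside the base
    try:
        safe_rmtree("/", base)      # the missing "./" case
    except ValueError as err:
        print(err)
```

Resolving both paths first means a missing `./` or a stray `..` gets caught by the comparison instead of by the filesystem.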
Key takeaway: always check the cam.
Second web related job at an insurance company, I was 20 years old at the time. We were heavy into online advertising, mostly banners at the time (this was right around when adwords started to get big). The company just bought out all of the MSN finance section of their site for the day-- it was a pretty big campaign ($100,000). We drove all the traffic to a landing page I had created with a short form to "Get a quote".
IT had given me permissions to push things live for quick fixes and such. I made a last minute design tweak and, you guessed it, broke something. I was checking click traffic and inbound leads and realized traffic was through the roof but leads were non-existent. This was about 45 minutes after the campaign was turned on. I jumped on the page and tested it out and got an error on submit. FUCK. I literally started to perspire INSTANTLY.
Jumped into my form and quickly found the bug, can't recall what it was but something small and stupid, then pushed it live without telling a soul. Tested, worked, re-tested, worked. Ran some quick numbers to get a ballpark estimate on the damage I caused... several thousand.
Stood up and walked over to the two IT guys, mentioned I borked things and that I had fixed it... what should I do? I can still see the look on their faces. Shock, then smiles. Walked back to my desk and about 10 minutes later my two bosses show up (I worked for both dev & marketing managers).
They said thanks for catching the problem, not to worry. I did good for finding it myself, fixing it, and pushing it live. I was still sweating and shaking. They walk off and later that day marketing manager informs me MSN will refund us for the 45 minutes of clicks.
It took about a month before I felt competent enough to touch our forms again.
On a Solaris box.
Hilarity ensued when we next rebooted it.
This sounds like a simple task, but it gets complicated by the variety of pipe fittings and adapters available. Our sensors are a particular thread type, and we have to find a free slot to install them, and come up with any pipe connection converters necessary to install them there. Another tricky part is that the rig workers who actually know about all of this stuff are often not particularly eager to help out.
So on one particular job, the only free slot to install the sensor is a male pipe fitting, capped with some sort of female plug. Our sensors are male in that pipe size, so I need a female-female adapter to install it. I go looking around and come up with one, not paying too much attention to it. I install it, and everything seems to go more or less smoothly. We go on drilling with this installed for like a week or two.
One day, the rig manager comes to find me and ask me about this adapter that I used. He tells me that it is meant for drinking water lines, and is only rated to 200 psi. And had been installed on a 2000 psi line for weeks. My jaw dropped in shock - I have no idea how that adapter didn't fail, and it's entirely possible it could have hurt or killed somebody if it did.
They sent one of their guys to find an adapter that was actually rated for the pressure and replace it, and never said much else of it. No telling how much trouble I could have been in there if anything else had happened. It did make me a lot more safety-conscious.
I was doing HVAC work while I was in college and we were removing an old air handler from underneath a house. Just inside the crawl space, under the access door was a water pipe. My boss told me to make sure I held it down while we slid the air handler out through the hole. I lost my grip on the pipe and the air handle snapped it in two, at which point gallons of water began to gush into the crawl space.
I ran for all I was worth to the road, which in this case was about 600 feet away, to turn off the water at the water meter. I ran up and down the road in front of the house and never found the water meter. So I ran back to the house and inside and told the homeowner who promptly informed me that they used well water. She called her husband and he told us where to turn off the well pump.
It wasn't really that bad in the grand scheme of things but letting the homeowner's water gush under the house for about 15 minutes does not bode well when you are supposed to be there to fix problems not create them.
Because of me, one December, everyone in the country who went to the cinema got to watch anywhere between 30 and 45 minutes of ads before the main presentation started.
Lesson learned: write more tests, monitor everything.
I run around like a headless chicken trying to find who knows the right backup to use and so forth, and I can't figure out why everyone is so calm and collected about it. Production was down/shit, I hope I still have a job. Turns out we had no active clients at the time - no-one was accessing the site. We'd finished one run and were in 'dead time' before the next. My next project involved implementing coloured prompts and I no longer leave production ssh sessions lying around when I've finished with them.
My CTO still has me listed as "database [vacri]" in his phone...
I now never execute ! commands as root. Actually, nowadays I simply use CTRL-r.
I immediately admitted it and showed everyone the bash history, I was suspended, then fired.
My screwup was at my first "real" job, fresh out of college. I was asked to free up some space from the production server at $BIGCOMPANY, because it was already at 99% capacity (it managed to get to 100% for a few minutes before I "solved" the problem). The thing is, at this $BIGCOMPANY, for some reason the budget for disk drives was non-existent, and this meant that whenever the disk usage was at or below 95%, we were happy because we still had free space... figure that.
So here I come, armed with the most dangerous tool a newbie can wield... root access and the drive to impress your boss. I said to myself, "I've used root at my home machines plenty of times and nothing bad happened because I've been using Linux for several years by now and I know I need to be careful... so I don't get why everyone says you should never log in as root". Oh boy, how I learned the hard way.
To continue my story, it turns out that the easiest/fastest way to free up some space was to delete the log files for pretty much everything (except the last 5 or 10 logs... because we were "careful", in case we ever needed them). We usually deleted things under certain directories known to hold "useless" logs. So here comes Mr. Newbie-guy-with-the-need-to-shine, and I thought to myself, "why keep deleting the logs from the same directories over and over if that only buys us about 1 or 2 percentage points, instead of cleaning as many logs as possible for the system and freeing up a lot more space?"
After thinking about it for like 10 seconds the most genius thought of my career materializes: do an rm -rf *.log on the topmost level directory of where we used to store everything (webserver, webservices, databases, etc). I happily pressed enter, and a couple of minutes later, hooray! I got the disk usage down to a whopping 90%! I was a hero! That meant we had bought enough time to keep on working without worrying about the disk space for at least another one or one and a half months. This was a clear victory and a testament to my superb sysadmin skills.
Fast forward 4 hours, and the phone starts ringing like crazy as every other employee (non-IT ones) started wondering and then calling us to try to figure out why their data was gone. They did not understand how come they had been working A-OK so far, and then suddenly ALL data from the sales team, admin team, the bosses, etc. was gone. And then a few minutes later... the whole intranet came down crashing and burning.... then a full stop... nothing was working.
So we went to the logs directory... oops... no logs there! OK, let's try to ping the DB. Dead. It's not running and it's responding with an unknown error. When I tried to connect to it, it would do so, but then some cryptic ORA-xxxxx error came up. No problem, says I, I'll just google it and fix it.
Not so fast, young grasshopper. That error meant that the DB was out of sync with its own files used for, ironically, data corruption prevention and rollback (or something like that... to this day I still don't fully understand what those files were used for).
As far as I can remember, those logs were a sort of pre-commit place, where all changes would be stored in those files and every X hours the changes would get committed to the actual DB tables. It was some functionality that supposedly was used to correct corrupted entries and to recover (figure that...) and roll back data when lost, or something like that. And unfortunately bringing the system back in sync was way out of my league (did I fail to mention that I was by no means a DBA?).
However, a stroke of good luck came down on me, as the company had a support contract with Oracle and it was the Platinum-covered-diamonds level or something. That meant that after creating a support ticket at like 1AM, I got a call from one of the support guys less than 20-30 minutes later. This guy seemed calm and told me I should not panic, it was just as easy as doing $crypticOracleStep1, $crypticOracleStep2, $crypticOracleStep3 and voilà! All would be good again. Except for the fact that I had NO IDEA what those steps actually required me to do. Almost in tears, I asked the rep to pretty please SPELL every command I needed to execute, letter by letter. I did not want to screw up again.
So there I was, at close to 2AM, with my boss breathing down my neck asking me what every frigging letter of the command I was typing did (which I had no idea...), all the while trying to keep up with this super friendly guy who was patient enough to spell everything two times.
After a couple of commands, behold! The DB could be brought up again! Oh boy, did I feel relieved. I was jumping up and down because I had fixed my stupid mistake... or so I thought. After almost causing the support guy to go deaf due to my loud cheering, he says "however...". Wait... what? There's a "however"?!? Then he continues saying, "since you deleted the pre-commit file of the last day, the DB is back in sync... up to yesterday". My jaw dropped to the floor. That meant that the ENTIRE previous day was utterly lost.... sales data, contracts, customers' info, etc.
I thanked the guy for his help, hung up the phone and turned to my boss, telling him that I was ready to turn in my resignation letter just after helping capture what available data was actually there (in papers, by calling customers and asking them again, etc).
My boss then turns to me and says, don't worry. We've all been through this at least once in our careers. Even I made a mistake that is terribly similar... however when I brought down the database, it took us one full week instead of one day... and rest assured that as I learned my lesson, you did as well. And I need guys like you, that have the initiative to solve things... and the ability to learn from mistakes. So don't worry, you are not losing your job. However you can't go home until you help everyone get as much data back as you can.
Aw shoot... well.. I guess it could've been worse. So after having lunch with my boss and the other teammates at like 6-7 AM, I went to the sales dept and started asking around how could I help them get their data back.
Those were the longest 38 continuous work hours I've ever had to endure. I did not go back home until more than a full day and a half later. I was tired as hell to say the least... but to this day I think it was a blessing that I got to learn such a hard lesson while being backed up by a boss that was very cool and progressive about it.
0) Never ever ever ever use root, especially for deleting files and ESPECIALLY with the -f flag.
1) Do not assume that something you know will hold. Confirm it in the particular system you are going to be working with. (i.e. do not assume .log files are always log files because in your laptop that holds true)
2) Be ready and willing to assume the consequences of your actions. Most of the time if you assume responsibility for your mistakes, people will forgive you and even give you a piece of advice.
3) Never ever ever ever use root.
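Lesson 1 can be made a bit more concrete. Below is a hedged Python sketch of a "look before you delete" step: it only lists candidates, only inside directories you name explicitly, and it never recurses. The function name `find_log_candidates` and the `keep_newest` default are my own assumptions, loosely based on the "keep the last 5 or 10 logs" habit from the story.

```python
import fnmatch
from pathlib import Path

def find_log_candidates(allowed_dirs, pattern="*.log", keep_newest=5):
    """List files that WOULD be deleted; nothing is removed here.

    Only the directories passed in are inspected (no recursion), so a
    database's own *.log files elsewhere on the disk are never considered.
    """
    candidates = []
    for directory in allowed_dirs:
        files = [
            p for p in Path(directory).iterdir()
            if p.is_file() and fnmatch.fnmatch(p.name, pattern)
        ]
        # Newest first, then skip the ones we want to keep.
        files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
        candidates.extend(files[keep_newest:])
    return candidates
```

Printing the returned list and reading it before deleting anything is the whole idea; a file like a database redo log showing up in that list is the cue to stop.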
It wouldn't have been a big deal, were it not for the fact that it was an EC2 instance, and back then halting the instance was equivalent to deleting it permanently. We then spent the night at the office recovering and testing the server. I think we left at 3:00 AM that day.
Lesson #1: it's never a good idea to "shutdown -h now" on a shell. any shell.
Lesson #2: have the process to spin up a new production server fully automated and tested
The PM brought the issue to the CTO, but somehow I didn't get fired. Ended up apologizing (obviously a poor choice of words :)) and moved on. Never made that infrastructure change.
Key takeaway: if you're going to talk shit, don't do so in writing. ;)
I lost $7 million in minutes by being short $700 million of US 2yr notes when the levees failed during the hurricane Katrina disaster.
Although my bet that the 2y point would be under pressure in the intermediate term turned out to be true, I got carried out by fund flows as folks spazzed out to cut risk by rolling into short duration high quality paper.
To his credit, my boss, who sat across from me, said only: "wouldn't want to be short 2 years." He let me make the call, which I did, and I covered my position. (Ouch.)
My book was up considerably on the year already, but this was a huge hit, and nearing year-end. I dialed back the risk of my portfolio and traded mostly convex instruments (options) for the remainder of the year.
The lesson from this is pretty obvious. Backup. Make sure your backup is good and safe.
My worst work-related mistake was getting into business with a friend. It cost me the friendship, a very valuable client, and a good portion of my retirement savings. I'm not sure how related it was, but a few years later my (former) friend killed himself.
And the lesson here is not to go into business with friends. Or at least to set up the business as if you're not friends.
1) after a few months working in a bank, I was doing some simple admin check task via RDP to a Windows 2003 (no, maybe 2000) server, when I right-clicked the network icon and instead of clicking the properties option I clicked "disable". Just enough time to say "oh sh!t" and to realise that it was the production Trading On Line machine, on a remote datacenter, during market hours, and to discover a couple of minutes later that the KVM over IP was crappy and was not working. We had to call the datacenter operators to go back to the local KVM and re-enable the NIC.
Lesson 1: Better move slowly when you're on a production machine (and also have plan B and C to reach your machines is a good idea)
2) same bank, one or two years later, I was doing some testing on a new mail system that also integrated VoIP (SIP). Mail/SIP system running in a VM (I think VMware Server at that time) in the same remote datacenter as above. So, I enable the SIP feature and after a few seconds, boom, we lose the whole (production) datacenter and the connection between the local server room and the datacenter. Panic, I look at my colleague, WTF in stereo, everything comes back for a few seconds, boom, down again. Long story short, the issue was that that version of the Netscreen firewall's ScreenOS had a buggy ALG implementation for SIP that led to core dumps. The fun thing is that we had two of those in HA, same version of course, so they were bouncing between core dumping, rebooting, slave becoming master and then core dumping again, etc. We had to ask a datacenter operator to reach the rack, disconnect one of the cables from the firewall (the one that was managing the traffic of the DMZ where that machine was hosted) and then reach the virtual host to kill the machine.
Lesson 2: you can segment your network but if everything is connected through the same device(s), sh!t can still hit the fan...
Most servers had those hot swap drive bays for convenient access from the front while the server was running. You only had to make sure no write operation occurred while you pulled the drive out of the bay.
So, I had to exchange a backup disk on a database server running quite a few rather large forums. The server had two disk bays: One for the live hard disk and one for the backup disk. I was absolutely sure at that time which one was the backup disk so I didn't bother to shut down the database server and incur a minimal downtime. Of course, I was wrong and blithely yanked the live disk from the drive bay.
I spent the rest of the night and most of the following day running various MySQL database table repair magic. It worked out surprisingly well but having to admit this error to our forum users was embarrassing, nonetheless.
Lesson: Appropriately label your servers and devices.
# cd /etc
# emacs inetd.conf
# ls
... ... inetd.conf ... inetd.conf~ ...
# rm * ~
# ls
# ls
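In case the stray space is easy to miss: the shell splits `rm * ~` into two arguments, `*` and `~`, so the first one matches everything in the directory. A small Python illustration using `shlex.split` and `fnmatch.filter`; the file list is illustrative (the real `ls` output is elided above), and this deliberately ignores tilde expansion and the shell's dotfile rules.

```python
import fnmatch
import shlex

# Illustrative directory listing; "passwd" is just a plausible extra file.
files = ["inetd.conf", "inetd.conf~", "passwd"]

def would_remove(command, names):
    """Return the names matched by the glob arguments of an rm command line."""
    args = shlex.split(command)[1:]  # drop the "rm" itself
    matched = set()
    for pattern in args:
        matched.update(fnmatch.filter(names, pattern))
    return sorted(matched)

print(would_remove("rm *~", files))   # ['inetd.conf~']
print(would_remove("rm * ~", files))  # ['inetd.conf', 'inetd.conf~', 'passwd']
```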
I was off Friday, so I come in Monday morning to see that ~20k customers have been getting free stuff since Thursday lunchtime.
Lost something like $200k because of two nullable columns :(
Cue me 7 months later reviewing the system. Realizing that critical jobs were no longer running and that our users were all essentially receiving 100% free hosting for however much storage they wanted. SOOOO I turned the jobs back on.
The lead engineer before me left no documentation of what the jobs did other than that they should be run. In my stupor I did not review the code. The jobs sent out a blast of emails warning that files would be deleted if not cleaned up or maintained. Then seconds later deleted said files...
We nuked around 70GB worth of files before we realized what happened. WELL GET THE TAPES! Turns out our lead engineer ALSO forgot to follow up w/ system engineers and the backups were pointed at the wrong storage.
No jobs lost, thankfully the manager at the time was a wordsmith of the highest degree and can play political baseball like a GOD.
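The "warn, then delete seconds later" behaviour is the part that a grace period would have defused. A hedged Python sketch of that idea follows; `plan_cleanup`, the `warned_at` mapping and the 7-day default are all my assumptions, since the original jobs were undocumented and I have no idea how they tracked warnings.

```python
from datetime import datetime, timedelta

def plan_cleanup(warned_at, now, grace=timedelta(days=7)):
    """Split over-quota files into those to warn about and those safe to delete.

    warned_at maps file name -> datetime the warning email went out, or None
    if no warning has been sent yet. Nothing is deleted here; this only plans.
    """
    to_warn, to_delete = [], []
    for name, warned in sorted(warned_at.items()):
        if warned is None:
            to_warn.append(name)          # first pass: warn only
        elif now - warned >= grace:
            to_delete.append(name)        # warned long enough ago
    return to_warn, to_delete
```

With this shape, turning a dormant job back on after 7 months produces a batch of warning emails on the first run and no deletions, which leaves time for someone to notice.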
Within about 2 minutes CTO strolls in asking about the flood of exception emails due to each request being unable to connect to the database.
Thankfully, I was able to apt-get install mysql-server, all the data was still there, and things were back to normal within 5 minutes.
Me: No we don't. We have 121 bad orders.
Boss: There are thousands of them!
Me: No there aren't. There are exactly 121 of them. I'm sure.
Boss: I'm not going to argue with you!
Me: Good. Because you'd lose.
I fixed 121 orders that night. The next day my login & password wouldn't work.
I list the content of my home directory trying to understand which folder was so big. Then I see it. A folder usually empty. Empty because I use it as generic mount point. A mount point that the day before was attached via sshfs to the production server...
I had a strange feeling, as if I was seeing myself from behind, something crumbling inside me. And at that moment someone starts to ask "what's happened to <hostname>"?
I take my courage and I say "I know it"...
That was really hard. The worst day at work in years, and during the last day too. Luckily we had a good enough backup strategy and the damage was mostly solved in a couple hours.
There I realized how much of an idiot I was to have mounted the production server on my home and I grew a little.
1. I passed the resume and chat portion
2. I passed the telephone questionnaire and got along great with the interviewer
3. (Fail) I scheduled my interview on a Friday at 4:30pm and there is a 30 min travel time. I left 1hr early...still it was Memorial Day weekend, so I thought the streets would be quicker than the freeway since it was at a standstill. I was so stressed that I literally had an anxiety attack and couldn't even find the address. Never happened to me before, so I'll never forget it.
We had the tech support contract for the city's Mexican consulate. One of the things we were doing was patching and updating their server and installing a tape drive backup system. Server was NT4.
I'm in there doing work after 5pm, and wrongly assume that everyone's gone home for the day. Install some patches and the server asks me if I want to reboot. I say yes. Few moments later, a guy sticks his head into the server room and asks if I'd shut down or rebooted the server. Oh, whoops, someone's here. Yeah, I just installed some patches. Oh, OK, see ya.
Next day? Turns out he had been doing some work in their database where they track and manage visa applications. That database got corrupted when I did the server reboot while he was doing his work. That night, the backup process then overwrote the previous good copy database on the tape drive with the newly corrupted database. We had not yet started rolling over multiple tapes to prevent backups of corrupt data, though we were going to purchase some tapes for that purpose shortly.
Summer was ending, and I quit a week later to return to school. Horrible timing in terms of quitting! No idea what happened after that, as I was spending the summer in a city that was not my own. I do know that the original database developer contractor was on vacation at the time and so they couldn't reach him. I think the consulate was SOL. I regret rebooting that server without checking if anyone was working to this day.
Lesson learned? Don't assume anything when doing anything. Carried that lesson with me for the rest of my life. And find a boss who knows how to guide you if you don't have much experience in your area. I guess for founding startups, at least get an advisor.
So one of the first things I wanted to do was setup a development db for which I exported the structure from their prod db. I then proceeded to change the name of the create database statement at the top to the new dev db I wanted and ran the script.
Unfortunately the prod db name was still prepended to every drop and create table command in the script, so I had just replaced their whole prod db with an empty one.
Owning up to that was one of the most embarrassing moments of my career. It was such a rookie mistake I just wanted to die. Luckily they had daily backups so I only cost their 4 man business about half a day of work but... it was enough for me to be a much more careful developer from that day forward!
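A cheap pre-flight check for this exact trap is to scan the exported script for any leftover mention of the production database name before running it. A rough Python sketch; `statements_mentioning` and the `shop_prod` / `shop_dev` names are made up for illustration, and splitting on `;` is deliberately naive (it would mis-split a semicolon inside a string literal).

```python
import re

def statements_mentioning(script, db_name):
    """Return the statements in an SQL script that still reference db_name."""
    # Match db_name as a whole identifier, quoted or not, case-insensitively.
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(db_name) + r"(?![A-Za-z0-9_])",
        re.IGNORECASE,
    )
    statements = [s.strip() for s in script.split(";") if s.strip()]
    return [s for s in statements if pattern.search(s)]

script = """
CREATE DATABASE shop_dev;
DROP TABLE IF EXISTS `shop_prod`.`orders`;
CREATE TABLE `shop_prod`.`orders` (id INT);
"""

for stmt in statements_mentioning(script, "shop_prod"):
    print("still points at prod:", stmt)
```

If the list comes back non-empty, the script is not safe to run against a server that also hosts prod.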
me: "unix definitely won't just let me cat /dev/urandom > /dev/sda"
other: "sure it will"
me: <presses enter>
what I learned? unix will absolutely let you hang yourself. 1998, production server for a fortune 5 company.
Thankfully, someone stopped me before I turned it on.
I'm the DBA.
We were rearranging the layout of the office. Coworker was moving in to his new space, setting up his desk. He boots up his computer, wonders why he has no network. Looks around, discovers the ethernet cable isn't plugged in. Plugs it in to the wall, still has no network.
A few minutes pass, and the entire office is running around wondering why the hell the network isn't working. Maybe an hour passes, the network guys are losing their shit trying to hunt down what is wrong. I'll give you a hint: the router was lit up like a Christmas tree, and the aforementioned coworker had both ends of his ethernet cable plugged in--but neither end was attached to his computer.
All in, took 4 days and a new server where the hard drive had stored bad pages on the DB. We lost 2 days of orders (they were processed through to the internal systems though so not really lost)
Lesson learned, validate backups and check page integrity when backing up
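To show the shape of "validate the backup, not just the backup job", here is a small Python sketch. It uses SQLite purely as a stand-in, which is almost certainly not the database from the story: the copy is made with the `sqlite3` online backup API and then `PRAGMA integrity_check` is run against the copy. `backup_and_verify` is a hypothetical name.

```python
import sqlite3

def backup_and_verify(source_path, backup_path):
    """Copy a SQLite database, then run an integrity check on the COPY."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)  # online backup API, available since Python 3.7
        result = dst.execute("PRAGMA integrity_check").fetchall()
    finally:
        src.close()
        dst.close()
    # SQLite reports a single row containing "ok" when no problems are found.
    return result == [("ok",)]
```

Other database engines have their own equivalents; the point is only that the check runs on the backup itself and its result is treated as part of the backup succeeding.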
Lesson: Keep the code that touches production databases as simple as possible so it's easy to verify exactly what it does. I was using a framework's database tooling incorrectly because I never dreamed what I used would touch the database's counters.
(Not my worst mistake in terms of people affected, but it's the only mistake that was literally laser etched in metal forever.)
Raw unadulterated fear followed by panic.
A full reinstall.
Triple checked dd params ever since.
"Shit. Well, as I have just demonstrated, it becomes possible to wipe out a million user login credentials at the touch of a button. So now we'll be needing to restore that from the backups which we don't have." Luckily, and ONLY BY CHANCE, I happened to have a copy of that table exported for other reasons from a few days back.
Lessons learned: Never press enter.
1. First day at a job. I need to get familiar with a legacy system and get a SQL dump from it to create a local copy of the database. After some SSHing and MySQLing, I confuse my two split terminal panes and end up importing my local dump to production server. Of course database names and users were the same so I end up dropping the database. No biggie. Backups were available from previous day.
2. Similar story to the first one. I got a new shiny Zend Studio IDE. Want to set up sync with remote server (just a static company website with no version control). Fill all the settings, press the sync button - and what happens? Zend Studio somehow figured that I want to force sync my local folder, which is empty, to the remote site, and it just deletes everything on the web root and uploads my empty folder. Wat. Should have read the settings twice.
They had ASIC design runs for research purposes once every three months, yielding your design on Silicon as ten 6" wafers. It gives enough parts for testing the first revision of your design. The person was carrying the wafers to a vendor for cutting into separate ICs and packaging or something. Gets to the parking lot, and where are the keys. Puts the wafers on the top of the car, finds the keys in his pockets and starts driving. Boom, the box of wafers was still on the top of the car, now on the ground. All broken. Some $100K in wafers + three months lost + bad face before the customer + ... Lesson: Don't put stuff on the top of the car!
edit: This was after I asked for permission to do this.
Lesson learned: Don't EVER use Coldfusion as a web server.
A friend had referred me for a sysadmin job opening at a web hosting company in Florida. After a brief interview I got the job for a pretty decent salary and was told when I could start. What they hadn't told me was that my schedule would be tuesday to saturday. I had informed the hiring manager of my preferred schedule (monday-friday), but I guess nobody mentioned it to the manager of the group.
When I got there they told me my schedule and I immediately told them that's not what I signed up for. So they asked me to sit for a while so they could figure out what to do next. I took a tour of the NOC, and saw one of their tier 1 technicians was chatting and watching a movie. I walked up and asked him "Heyya! Workin' hard, or hardly workin'?" and smiled. He did not smile back. So I went back to the desk I was assigned to, which was already logged in - with the credentials of the previous admin.
While I waited I decided to see what other trouble I could get into. Sure enough, all the old passwords were saved in the old admin's browser with no master password. I couldn't copy-paste the list, so I took a screenshot and began to find a way to print the list out to post on my cube wall. Before I could finish I was asked to leave for the day while they figured out my schedule changes. I should have gotten the hint when they asked me to leave the badge there.
Later I got a voicemail telling me they'd pay me for the time I spent there (about three hours) and they'd no longer require my services. Luckily I got hired soon after to a different company, which was also hiring away all the talented people from the place that had let me go, and the web hosting company eventually went under. So it turned out to be a good thing in the end.
Users were mapped into specific silos to separate out each level of the stack from CDN to storage to db. There was a bit of code executed at the beginning of each request that figured out if a request was on the proper subdomain for the resource being requested.
This was a feature that was always tricky to test, and when I joined the codebase didn't have any real automated tests at all. We were on a deploy schedule of every morning, first thing (or earlier, sometimes as early as 4am local time).
By the time the code made it out to all the servers, the ops team was calling frantically saying the power load on the strips and at the distribution point was near critical.
What happened: the code caused every user (well upwards of millions daily) to enter an infinite redirect, very quickly DoSing our servers. It took a second to realize where the problem was, but I quickly committed the fix and the issue was resolved.
Why it happened: a pretty simple string comparison was being done improperly; the fix was at most 1 line (I can't remember the exact fix). There was no automation, and testing it was difficult enough that we just didn't test it.
What I learned: If it's complicated enough to not want to test using a browser, at least always build automation to test your assumptions. Or have some damn tests, period. We built a procedure for testing those silos with a real browser as well.
I got a good bit of teasing for nearly burning down the datacenter on my very first code deploy, but ever since, it's been assumed that if it's your first deploy, you're going to break something. It's a rite of passage.
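For anyone wondering what "automation to test your assumptions" could look like for a silo check like this, here is a minimal sketch in Python. Everything in it is hypothetical: the `SILO_FOR_USER` mapping, the hostnames and the function names are made up for illustration, and the original code and its one-line fix are not known. The idea is only that the hostname comparison is normalized, and that a cheap in-process check confirms a redirect chain settles instead of looping.

```python
from urllib.parse import urlsplit

# Hypothetical mapping of user -> silo host; the real lookup is not described.
SILO_FOR_USER = {"alice": "s1.example.com", "bob": "s2.example.com"}


def redirect_target(user, url):
    """Return the URL to redirect to, or None if the request is on the right host."""
    parts = urlsplit(url)
    expected = SILO_FOR_USER[user].lower()
    # Compare normalized hostnames only: lowercased, no port, no trailing dot.
    actual = (parts.hostname or "").rstrip(".")
    if actual == expected:
        return None
    return parts._replace(netloc=expected).geturl()


def settles(user, url, max_hops=5):
    """Follow redirects in-process; True if they stop within max_hops."""
    for _ in range(max_hops):
        target = redirect_target(user, url)
        if target is None:
            return True
        url = target
    return False
```

A check like `settles("alice", "http://S1.Example.com:80/photo")` can run on every commit without a browser, which is exactly the kind of case a naive string comparison gets wrong.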
Thoughtful pause "Why is this taking so long!?"
Unfortunately I got my selection criteria wrong and pulled out all of one cluster and half of a second, halting a few thousand operations.
Luckily the monitoring system was very quick to alert me of this and using the same (wrong) selection criteria it was a fairly simple process to stop the update and put them all back in the cluster.
Takeaways? The age-old cliche of "With great power comes great responsibility". Oh, and have good monitoring!
I left that job about 3 years later when the metaphorical train stopped at a nicer place. My name is still known in certain circles for this ["Oh bah, how could I forget?" one former manager recently stated], but I don't plan to go back there at this time.
I learned that life's too short for assholes and working in an environment you don't like. If you don't screw up, your soul will die and you'll become that former coworker you hated so much and who hated you in return. It's worth picking and choosing where you work.
Possible outcomes of unplanned system halts include plugged machinery that would need to be manually cleared, mixed products which would become immediate net losses for the company, and damaged motors.
Thankfully no product was being run at the time. I have also implemented changes across the board to our client sites that prevent this type of shit from ever happening again. You know when you look at a system and go "this is going to bite us in the ass eventually?" This was one of those systems, they just needed a new hire to give them the push.
Luckily, there was no slap on the wrist or anything, the store manager knew that after doing thousands of these cards this was only one of a few slip ups I've made so they just brushed it off and moved on.
I have no idea why they didn't use UPS, but it took many critical servers offline and caused a few hours of headaches for everyone.
Come to think of it, that was the last time I was allowed in the server room.
Lessons learned - don't let developers in the server room.
As it turned out the only data that did go out was the single word "sheep" in the search index.
I'm still not sure how this bug slipped past the bank's tough app certification process, though.
I had to quickly get a patch in for the improper code and had to maintain that buggy implementation. In addition, the "standard" itself got a rather scathing write up from Peter Gutmann, which is completely valid:
This is a critique on the "standard" itself, the process was just as ugly.
Poof. Equipment electronics fried and useless.
I was chewed out. Could have been way worse.
Follow your safety procedures.
The next morning, the bunker was full to ground level and the automatic power cutoff had failed, as the float switch was directly under the cable duct and the water pressure of the deluge kept the float depressed. By the time the water stopped flowing the float was under a foot of mud. The powered circuits were undergoing electrolysis and eating themselves away, made worse by the site managers refusing to drain the bunker or turn off the power until a week-long arse-covering evaluation had been completed.
A few hundred million dollars of front line radar was out of action for several months.
Being a naive newly graduated engineer, I wrote a completely honest report and analysis. My boss said it was one of the best reports he had read and there was no impact on my career (if anything it got me noticed by the upper echelons of the organisation).
1. If you tell the truth you will be respected, even if it is incriminating.
2. If there is a way for something to go wrong it can do so (slight variation of Murphy's Law). Even if it's judged to be uneconomic to take preventative action, be aware of the possibilities, so you can make a conscious decision about the risk.
9 hours later I woke up to find 800+ emails in my inbox. Django by default sends out an email when an error occurs, and a tiny mistake of not installing a package led to a lot of frustrated customers and, well, a huge pile of email in my inbox!
Moral of the story: Put that pip freeze > requirements.txt and pip install -r requirements.txt into your deployment flow.
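As a rough illustration of making that part of the deployment flow, the sketch below checks that every exact pin in a requirements.txt is actually installed before the app goes live. It is an assumption-laden example, not anyone's real deploy script: `missing_requirements` is a made-up helper, it only understands `name==version` lines, and it relies on `importlib.metadata` from the standard library (Python 3.8+).

```python
from importlib import metadata


def missing_requirements(lines, installed_version=metadata.version):
    """Return (name, wanted, found) for every exact pin that is not satisfied."""
    problems = []
    for raw in lines:
        # Drop comments and environment markers, then surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line or "==" not in line:
            continue  # blanks, comments and anything that is not an exact pin
        name, wanted = (part.strip() for part in line.split("==", 1))
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems.append((name, wanted, found))
    return problems
```

In a deploy script you could read requirements.txt, pass its lines to this function and abort if the result is non-empty, so a forgotten package fails the deploy instead of filling an inbox.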
But this was one of my first. Years ago, making boot floppies for a physics lab where I was reinstalling all the servers:
I meant: dd if=/dev/zero of=/dev/fd0
I did: dd if=/dev/zero of=/dev/hda
Oops. Bye, partition table.
(Always double-check everything you type as root.)
I was a stock analyst, for a firm with dozens of institutional salesmen and thousands of retail brokers. Some of my recommendations were very, very wrong.
The right thing to do is stand up, take the heat, and explain what you now know as best you can. I learned that watching a colleague who I thought was otherwise an unserious ass.
Luckily we had backups from that morning so we only lost any address updates people would have done that day, but it made for some interesting customer service calls for awhile...
Takeaway: Sometimes, it takes a disaster to realize you were in another disaster anyways.
Every database alias I have now has the MySQL --i-am-a-dummy flag appended. This has been a career-saver in my eyes.
EDIT: I proposed a new password of: @$tevezA$$ignedPwD@# (Steve's Assigned Password)
He said no to that one.
Perhaps the only lesson is "slow down."
Lesson from this experience: never do an "rm" on the log file; instead do "truncate -s 0" on the log file.
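The usual reason for this rule on a POSIX system is that a running process keeps writing to the file it already has open: after "rm" the name is gone but the data is not, while truncating empties the same file in place. The sketch below shows the truncate side in Python. The `demo` function and the file name are made up for illustration, and it assumes a POSIX filesystem and a writer that opened the log in append mode.

```python
import os
import tempfile


def truncate_log(path):
    """Empty a log file in place, like `truncate -s 0 path`."""
    os.truncate(path, 0)


def demo():
    """Truncate a log while a writer holds it open; return (same_inode, contents)."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "app.log")
        # Stand-in for a long-running service holding the log open for appending.
        writer = open(log_path, "a")
        writer.write("old entry\n")
        writer.flush()
        inode_before = os.stat(log_path).st_ino

        truncate_log(log_path)

        # The service keeps logging to the same file without being restarted.
        writer.write("new entry\n")
        writer.flush()
        writer.close()

        with open(log_path) as f:
            contents = f.read()
        same_inode = os.stat(log_path).st_ino == inode_before
    return same_inode, contents
```

The file keeps its inode and the writer's next line lands in the now-empty file, so nothing has to be restarted or reopened.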
Somehow, in the settings, we had the "Store Credit Card Info" flag set to "Plain Text". The client's admins and staff could have used this information to make transactions (the backend showed full CC info in the order details).
We didn't realize it until we worked on it again for some bug fixes and new features.
Lesson learned: when transitioning from the DEV to the PROD environment, make sure to check all these critical flags and set them correctly.
Luckily, the client didn't have any idea what was wrong in the backend.
now on to my tasks.. had some files to print out. Where did they g...... FUCK.
I found a box of tapes and some sunos manuals. Spent the next several hours figuring out how tar and tape drives worked. Got everything back. Never told a soul.
1992. I've never done anything so careless since.
Two months later, the certs were expiring soon and we changed our configuration to something Android liked by default. The bad news was that our production Android app rejected the new configuration and only wanted to accept the current certs.
We ended up quickly shipping a hotfix that accepted the current and upcoming configuration a few days before the certs expired. There technically wasn't any 'downtime' as long as users updated the app, but this all took place right before 'holiday vacations', and the QA team had to test the fix while all the devs were away.
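The comment doesn't say exactly what the app was checking, so the following is only a guess at the general shape of such a hotfix, assuming the app pinned certificates by fingerprint. The idea is to keep a set of accepted pins that contains both the current and the upcoming certificate, so the rollover never strands a shipped build. The function names and the placeholder byte strings are hypothetical.

```python
import hashlib


def fingerprint(cert_der):
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(cert_der).hexdigest()


def is_trusted(cert_der, pinned_fingerprints):
    """Accept a certificate only if its fingerprint is in the pinned set."""
    return fingerprint(cert_der) in pinned_fingerprints


# Placeholder bytes standing in for real DER-encoded certificates.
CURRENT_CERT = b"current-certificate-der"
UPCOMING_CERT = b"upcoming-certificate-der"

# Ship both pins ahead of the rollover, not just the one in use today.
PINNED = {fingerprint(CURRENT_CERT), fingerprint(UPCOMING_CERT)}
```

With both pins in the shipped app, swapping the server over to the new certificate is a server-side change only.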
Oh yea, I run a proprietary trading firm (still at the same spot), as a result of that bug we went down and lost about $250k over the next few hours. Testing is important in automated trading :)
It failed anyway, but I wasn't around when it did and there would have been no "I told you so" credit even if I were.
One of those "big company" lessons, but probably applicable to startups (which have an even higher ego density).
Said person entered the number of metric tons of concrete 3 orders of magnitude higher than it should have been. Imagine the cost difference between 1.0 * 10^6 and 1.0 * 10^9 metric tons... Our boss was not pleased, to say the least.
But imagine how easy it is to enter a few extra zeros in an excel data cell. Yikes!
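One cheap guard against extra zeros is to compare each entered quantity with a typical value and flag anything that is off by more than about one order of magnitude. This is just a sketch of that idea in Python; the function names, the threshold and the notion of a "typical" value are assumptions for illustration, not something the spreadsheet in the story had.

```python
import math


def magnitude_gap(value, typical):
    """Orders of magnitude between an entered value and a typical one."""
    return abs(math.log10(value / typical))


def looks_suspicious(value, typical, max_gap=1.0):
    """Flag non-positive values and values far from the typical magnitude."""
    if value <= 0 or typical <= 0:
        return True
    return magnitude_gap(value, typical) > max_gap
```

For the numbers above, `looks_suspicious(1.0e9, 1.0e6)` flags the entry for a second look, while `looks_suspicious(2.0e6, 1.0e6)` lets it through.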
Reaction was standard: mostly to point out I did my best in unfamiliar territory and things should be sorted soon.
Takeaways were: (1) fewer support calls than expected - users put up with things. (2) you learn when you fail (3) always have a backup
They kept me on at that job but I left pretty soon anyway as I got a 'real' (as in creative) job hacking perl-powered VPN modules for those Cobalt Raq/Qube devices, and building a Linux-related online retail venture for the same employer ... that worked great, but failed commercially.
After the potatoes are peeled and washed they are run through a pipe with blades to slice the potatoes into french fries. These blades are sharpened with lasers and are insanely sharp because they need to cut a lot of potatoes before being changed.
One day they were shutdown and it was time to change the blades. The lady doing the change placed the new blades on the table and bumped the table when she turned to grab a wrench from her toolbox. The new blades started to fall and she instinctively reached out to grab them to prevent them from falling to the floor.
She ended up not grabbing anything because the blades sliced her fingers clean off. They took her to the hospital and due to the blades' extreme sharpness, the cut was so clean reattachment was a pretty easy procedure. I don't know if she had any long-term negative effects from the incident.
Safety is important, be aware of your surroundings and don't instinctively grab things you shouldn't be touching in the first place.
Client, not happy.
I was working at a startup that was trying to create an affordable 3D printer. We had two working prototypes that were used for everything - demos, print testing, software testing, PR shoots, everything. Each prototype had cost hundreds of man hours to build and debug and quite a bit of cash as well.
Among other things I had done all the work on the thermal control system for the printer; it kept the print heads and build chamber at the correct temperature. One night while working on one of the printers I hit an edge case that my control code didn't handle well and the printer turned all of the heaters on full-bore. Half an hour later all the plastics in the prototype had either melted or burned and I was left with a room full of smoke and a pile of scrap aluminum.
Learned: Learning on the job as you hack away on problems is great, but recognize that it's one part enthusiasm and one part risk management. Also learned to never try anything on the command line that I wouldn't want to see pulled from my bash history and stuck on the breakroom fridge. Also learned to cope with humiliation well.
What could possibly go wrong?
Gratitude is demonstrated through actions, not vague verbal commitments.
There's some money going into making better Litecoin/Scrypt miners, which is currently only gpus so that's positive.
The question you should be asking is how will Bitcoin advance computing. That question I think will get you a lot more answers:
- Security is essential. People require secure computing to operate safely in Bitcoin.
Those are two VERY different questions. I came here expecting to answer not to expect to develop for 30 years.
Ok, now for my answer. There are a few things you'll want to look at.
First, I'd argue you build very modularly. You don't know what the product is going to need to do in 5 years, let alone 30 years. Building a product from independent modules will allow future developers to add, remove or update modules to reflect the times.
As far as languages, go with either what is popular today, or what you have EVIDENCE will be popular in the next 10 years. I say evidence because if you just hop on the next big thing (Go, Julia, etc), you don't know at this point how difficult it will be to hire developers that are familiar with those languages in 10 years.
The libraries and dependencies answer is the same as the previous. Don't pick something obscure that you think 'might' be popular in the long-term, go with what's popular now or has evidence of popularity later. jQuery, Bootstrap, Rails, Node.js all had massive amounts of interest right from the start. However, if you can get away with not relying on a library, you should probably do it. If you need something now to ship quickly, but don't want it populating your project long-term, use it only in the modules that need it, and that way it can be easily replaced later (I'm doing this with jQuery now as Angular DOM traversal isn't great.. yet).
Seriously, comment the hell out of that code so that anybody coming in at any time can understand what's happening. Make it so that a technical director can read only the comments and understand the entire system and its dependencies.
If you are not in your office much, then I would just offer the GPG option. Sending 20 e-mails will not take very long. Out of the 20 students, I bet only half of them get their act together and e-mail you a key, so it's really more like 10 emails that you would need to send.
Even though it is terrible, I would just stick with the school's system.
Everybody would then attempt to decode each cipher, with only one working for any individual private key.
This isn't all that different from your original posting, except that you now only need to send one unique e-mail.
(For Beavis, who's getting an F because he never showed up to class, you might get in trouble with the administration because a simpleton couldn't decode their grade through this or other sophisticated means.)
That said, encrypt each grade with a key derived from the student's ID (which is privileged information) and make a webpage to do the decryption for the students. SHA256 ( ID + Salt ) == Key for symmetric encryption.
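The key derivation part of that is a one-liner with Python's hashlib. The sketch below only covers SHA256(ID + Salt); `derive_key` is a made-up name, and the salt and the ID format are assumptions. The cipher itself is left out because Python's standard library does not ship one, so the 32-byte result would be handed to whatever vetted symmetric cipher you pick.

```python
import hashlib


def derive_key(student_id, salt):
    """SHA256(ID + Salt): 32 bytes, the right size for a 256-bit symmetric key."""
    return hashlib.sha256(student_id.encode("utf-8") + salt).digest()
```

The same student ID and salt always give the same key, so the decryption page can recompute it from what the student types in.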
--former IT college staffer
They have access to SHA256 so they can privately find their own student ID hash and then look up their grades
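A rough sketch of that lookup scheme in Python, with made-up names and data: the published table is keyed by the SHA256 hex digest of each student ID (optionally with a course salt prepended, which is my assumption and not part of the comment), and each student hashes their own ID to find their row. Student IDs are short and guessable, so anyone who knows the format can brute-force the hashes; treat this as obscuring names rather than real encryption.

```python
import hashlib


def id_hash(student_id, salt=""):
    """Hex SHA256 of the (optionally salted) student ID."""
    return hashlib.sha256((salt + student_id).encode("utf-8")).hexdigest()


def publish(grades, salt=""):
    """Build the table to post: hash of ID -> grade, with no raw IDs in it."""
    return {id_hash(sid, salt): grade for sid, grade in grades.items()}
```

A student with ID "1001" would compute `id_hash("1001")` on their own machine and look that value up in the posted table.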
Another option is set up a website with a login id (student id or something) and have students submit a pin (4-12 chars) and let them use that to login and see their grades (probably should be ssl).
As for your question I don't see how you could send them encrypted, you could make up random ids for each student and only give that student their id then send grades out corresponding to their random ids but that may still violate privacy issues.
I could be out of my league here and you may have other reasons, but you could just run VM's. Vagrant/Virtualbox takes all the pain away.
cmd 'app add site Test' - this would create the vhost file, put it in sites-enabled with the doc root.
If you have the money, know at least 1 scripting language, and have an aptitude for technology, the OSCP certification course is pretty good.
If you want to go the cheaper route, there are lots of books. One introductory text a lot of people like is Hacking: the Art of Exploitation.
If you want to learn about web security, the Web Application Hacker's Handbook is a great book. For something less intensive, The Tangled Web would suffice.
If you want to learn to harden Linux servers, reddit.com/r/linuxadmin, /r/linux and /r/linux4noobs are great resources. Before you post questions, however, I suggest using the search function because lots of people ask for hardening guides.
Black hats generally network on IRC. You sit on some public IRC channel, build rapport , and eventually get invited to private channels.
There are plenty of resources out there on how to harden your server and reduce attack surfaces. You just need to spend more time familiarizing yourself with the landscape and quantify your actual goals.
There is a plethora of IRC channels, forums, mailing lists and whatnot where people share that kind of stuff. Frankly, a bug report is something like sharing it; before it's fixed, it is a zero-day exploit.
Managing workers and segmenting tasks = project management
both are available as fields of study and are defined roles in most businesses -
Product Manager: You take business goals and turn them into technical requirements, then ensure the product gets built.
Project Manager: You take technical requirements and ensure that your team delivers on those requirements.
User Experience Designer: You create mockups for products which can be implemented by the technical team.
I'd put together a portfolio of your work and send it off to a few companies, as that might be good enough to land you a job.
For many reverse engineering projects, assembly might be a wholly useless skill, since whatever you are looking at is actually MSIL or running on Python with its own embedded interpreter. Here assembly only serves to quickly tell you that you would be wasting your time :)
Personally my favourites are 6502 (http://skilldrick.github.io/easy6502/) and 68k (http://www.easy68k.com/) tho' neither of these are realistically of any commercial use.
Wonderful book from which a lot of knowledge is applicable to other architectures straight away. It teaches you about planning, control structure implementation and the maths behind it all as well.
That way you will learn what it is the computer is trying to do, and how constraints on how it is built change that.
Then I'd suggest some cheap 8 bit Microprocessors like the AVR series and the PIC series from Atmel and Microchip respectively, (the AVR has solid C support so it's probably a better single choice, but the PIC has weirdness associated with architecture constraints which is good to understand as well).
Once you are a pro writing AVR assembly code, then grab a copy of x86 assembly and a description of the Pentium architecture. To do it proper justice start with an 8086 assembly book, then a 286 assembly book, then a 386 one, and finally a Pentium one. That will let you see how the architecture evolved to deal with the availability of transistors.
Making trial versions complete and so on. Sometimes it was really easy (just finding a jmp and changing it); other times we had to compare with the complete program, finding code blocks, patching the trial and making all the checksums and stuff work.
None of the software that we cracked was released to the public, it was just for fun.
At the time there was little exercises called "crackme" for exercising your abilities.
It takes at least a year of work to start being really good at this, and it is not like Obj.C, Java or Python, or even C, but way more tedious. Without having friends in this and clear objectives I would have found it boring.
It would be probably a better idea to buy a micro processor and code simple things in assembly, like blinking LEDs.
Of course, to do that, you need to find the manual for your machine architecture. The x86 manuals are, for example, available here:
You also then start to notice things like the operating system specific application binary interfaces (ABI):
and object file formats such as ELF that's used in Linux:
or Mach-O used in Mac OS X:
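You can also do the same thing with the JVM and look at its JIT-generated machine code with the '-XX:+PrintAssembly' option (a diagnostic option that also needs '-XX:+UnlockDiagnosticVMOptions' and the hsdis disassembler plugin; '-XX:+PrintCompilation' only logs which methods get compiled):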
Its focus is actually writing assembly on an actual computer, with the goal of implementing a snake game.
The tutorial has extensive coverage of interfacing assembly and C code and so might be of interest to C programmers who want to learn about how C works under the hood. All the examples use the free NASM (Netwide) assembler. The tutorial only covers programming under 32-bit protected mode and requires a 32-bit protected mode compiler.
1 - http://www.charlespetzold.com/code/
The first fascicle is a free download and the place to start.
2. Knowing how the microprocessor works comes really handy while coding assembly as you can't 'catch exceptions' out there. It is like treading a land-mined area and nothing can replace the knowledge of the fundamental terrain- the architecture.
3. Since you know C, you can start with some serious gdb usage, as mentioned by @penberg.
4. Then find your sweet spot between these two ends. You could start with embedded robotics, another viable hobby could be IoT application. Two added advantages of these over 'theoretical' assembly language learning are that-
a) You are doing something with a real-scenario implementation, so you're surely hooked.
b) You can eventually mold a business model around it if you end up with something really innovative.
IDA Pro is the industry standard for reverse engineering but it also is expensive (like USD $2k). There is a free version but it doesn't offer 64bit, so not really an option for modern ObjC or Intel computers. As you've mentioned ObjC chances are you work on OS X. IDA pro is not working well on OS X (the recommended way is to use the Windows version via virtualbox and not the OS X version). Still, Hopper.app is a great alternative on OS X. Not as good as IDA, but it has a Python interface, GDB support, and decompile support for ARM, Intel (and some knowledge regarding Objc). And it's only ~USD$100. [There is also a Windows version of hopper.app but it seems not yet ready to use, as I've only heard bad things about it there so far.]
Introductory Book: http://www.amazon.com/x/dp/0763772232/
The key is to choose a project that you are excited about. If you pick another blah assembly tutorial, without the excitement of a project pushing you, your enthusiasm will evaporate sooner or later.
It also depends how steep of a learning curve you want to encounter. I, personally, have not yet played with x86 assembly because the documentation for them is so unfriendly for beginners. To that end, when I want to play around in Assembly and learn techniques for that level of programming, I usually play with the DCPU (http://dcpu.com/dcpu-16/). It's fake and was designed for a (sadly) not-to-be-made game. But it is an absolute joy to program in.
Play around with that until you're comfortable and THEN tackle x86.
I would also grab a copy of Art of Assembly Language.
 http://twitter.com/N_is_stolen http://twitter.com/N
Don't they have people who monitor this stuff?
So yes, I've clicked on them. But not very often. But certainly "look at" more than I actually "click on", so there is still some value in these to some degree even if people don't click on them.
The point is that the size of a word means something. A large word is typically intended to indicate, "Here is something that this person is talking about, that a lot of other people are also talking about. Click here to see what others have to say."
> Can someone explain why these became popular?
I suppose this happened, in part, because they seemed to have some reasonable-sounding theory behind them (see above).
The cool factor was probably more important, though. UI trends often ignore usability issues, after all, and tag clouds are an automatically generated example of the kind of "messy" art that became popular a decade or two ago.
My opinion is that tag clouds are better served as art than as functional UI elements...
To put it another way, if you don't know what you really want to do with your life, it's generally a good idea to have as high-paying a job as possible. This way, you can get paid to figure out your life. Also it helps to have some money in the bank when and if you do figure out what you want to do.
Wanna be an actor? Well, now you can afford an agent. Wanna be a writer? Well, now you can afford to go to conferences and fly around the country. Wanna be a musician? Well, now you have some cushion to tour the country.
Basically, if you don't know what your passion is, then just keep working at your high-paying job until you figure it out. It's not worth just sitting around doing basically nothing.
The things you take for granted, like being able to pay for shit, may not be the case once you figure out and pursue your passion. You may want to purchase a membership to some exclusive writers' club (if you want to become a writer) but find that you no longer have the funds because you quit your high-paying job. I would say figure out a way to pursue your passion smartly so that you're not left completely broke.
Also, if you're not already doing it on the side in some capacity, then I would definitely recommend NOT quitting your day job. It's a passion. If you're truly passionate about it, then it should be something you're already doing.
About 4 years from now my wife and I are leaving the corporate world forever with all of the loans paid off and enough in the 401k's to never worry about retirement again. We will work at what we love (outdoor guiding and photography) and will be significantly happier for it.
Engineering and tech are great for some people. I'm glad I looked elsewhere to find a passion though.
My computer would probably get an upgrade or two tho.
All I need from life from here on out is to be able to code, and enough income to sustain me. I have no dreams of owning my own company or getting rich. So I am pretty close to my ideal existence :)
The secret of success is an ability to recognize and follow the passion moment to moment.
That is the big part of the secret of "living in the now".
But the common thread between all the passions I've had is that they involve solving problems creatively and aiming for beauty in the end result.
I was exposed to web programming at an early age and built my first Geocities website, which served pirated movies, at the age of 11. After finishing the website, I noticed a lot of people actually came to my site and wrote thank you notes in my shitty Geocities guestbook.
Being 11 years old, seeing those thank you notes really encouraged me to move forward, and I bought my first PHP book.
Ever since, I started building websites and products to reach out to the users. There were lots of failures, actually, most were failures, but that didn't stop me from moving forward.
Passion is something that you find value in doing. For me, it's not about the money, but it's about seeing those thank you notes for providing value to my customers.
In 2001 I got my first taste of the internet when my mom brought home a crappy laptop and connected it via dial up. I was 6 or 7 and got hooked playing chess on yahoo games.
Ever since, I've been a power user spending at least 8+ hours online a day and loving it!
Started teaching myself to code in high school (4-5 hours a day; easy throwaway online classes == great way to spend senior year). After I graduated last year my average time spent in front of a computer climbed to summer vacation levels of around 16+ hours a day.
Little bit of addiction, whole lot of passion.
Thanks for all the emails so far - I hope you get something interesting out of it, and I look forward to your comments.
Github - $50.00 / month
Hoovers.com - $99.00 / month
Random AWS EC2 time here and there for demos or experimenting - varies, usually about $0.00 / month, but has been $40-50 a couple of times.
Co-working at Underground @ Main (Durham) - $199.00 / month
Mixergy subscription - I forget.. $20-25 / month or thereabouts, I think.
I prefer the Vi mode, though. Add to your .bashrc
set -o vi
Then you can press escape to go from input mode to normal mode; there k will take you to the previous line in command line history, j to the next line, ^ and $ to the beginning and end of the line, /something will search something back.
Editing is really fast; move by words with w (forward) and b (backward), do cw to replace a word, r to replace a letter, i to go back to input. It will remember the last editing command, just as Vi, and repeat it when you press . in normal mode.
Sometimes Excel, when the diagram can be made by drawing borders around cells and resizing rows and columns as needed. This helps when unplanned non-linear horizontal and vertical scaling is needed while making the diagram.
The SmartArt concept they introduced was quite promising, though I find the current state of it lagging. I am sad that it never picked up.
The downside of course is that it is not free.
I would also like to know open source alternatives to MS Visio. I can see that Ubuntu 12.04 comes with Libre Draw, but I haven't tried it yet.
Otherwise there is GIMP too.
+1 to all the other open source solutions mentioned here.
Integrated speakers (mostly in the bathroom, but through out the house and independently controllable would be nice).
This one just came into my mind today, but having a weather proof mic outside would be amazing, especially in conjunction with the built in speakers. I would love to have a natural rain/thunderstorm/birds played throughout my house (echoed from outside). Sure, windows are great but how often can you actually have them open.
* Locks (windows & doors)
The one other primary feature I would want is an instant facetime type setup between each tablet so that you can communicate with others in the house without having to shout or get up off my lazy arse.
I'd want a single closet/small room where I could put all the equipment for the media/entertainment, cable boxes etc. Then use IR extenders or better yet, one of those wireless remote control systems.
I'd do speakers in the ceilings/walls of most every room/area with zones and volume controls in them. Depending on the square footage, you may need multiple receivers to make it work nicely where everyone can listen to different tunes. Also, outdoor tunes have to be available too.
Wire the house for both wired network and of course wifi. Depending on budget and size of the house, fiber would be nice for at least interconnecting sections of the house.
Along the idea of the wireless remote system, turn an iPad into the house controller. Make life as easy as possible, something you could hand to your grand/parents and they would be able to push buttons and make it work. I have seen systems like this and drool at how nice it is, and it isn't like it is crazy expensive. No more 4 remotes or a "single" remote that works 95% of the way but takes a small training session to even turn on the TV.
Network drops everywhere. Even in the ceilings of major rooms. 802.11N is great, but nothing trumps Cat5E over fiber.
In-wall (or ceiling) speakers. At least 1 in every room, 1 in the 2nd floor hallways. All with volume controls. All wired to a central network closet with multiple Airport Express inputs so the wife can stream one music station to the bedroom when she's dressing, and I can stream another station to the family room while I'm waiting.
Network closet should span 2 floors with future pipes into the attic and into the basement for new drops. Network closet is preferably close to the main family room TV for major components. Switches, routers, firewalls (i was a sys admin in a past life) can all go in here. Money willing, put network equipment in 2nd floor closet, tv equipment in first floor closet.
The only real problem is when I'm programming or furthering my tech skills I feel like I'm short changing the social side. Same thing going the other direction. As I'm getting more comfortable with the tension between the two modes I'm feeling good about my potential as a consultant.
- I can't stop something I don't like about myself merely by being aware of it.
I have practically no chance of getting a job in software development or something similar.
I'm prone to being misquoted, which is probably worse than being misunderstood (I learnt I was prone to being misunderstood when I was about six).
I really enjoy hideous data structures.
They worked remotely (and still do) and have found success.
The only problem with your idea and the rest of the bootstrapped remote successful companies is that they were addressing a real customer need/problem and you are simply trying to address your situation of hating your 9-5.
Come up with a great idea, open your world to devs from everywhere, work your ass off initially and then build the company in a way in which that 9-5 doesn't feel like work anymore and can be done at 10-6 or 8-12 + 2-6.
I am involved with a couple of active projects. They are games targeting the Asia Pacific market. You can take a look: http://188.8.131.52/bz/about.cgi
It is not so much a way to escape 9-5, more of a place where like-minded people get together and build something interesting. If it pays off financially, even better.
I think the defaults should be "all" (not story) and "forever" (not past week). I searched for something that I knew existed and when it didn't come up, my first thought was that the search was broken. Better to show everything at first and let the user decide how to narrow it down if they want to.
Second, I think you'd be better off if the design looked like HN itself, especially the comment rendering. Not that what you've got is bad, but the user is almost always going to go from HN to search and back to HN. This context switch is already a speed bump. Visual changes, which take time and mental energy to process, add to that jolt. Anything you can do to minimize these cognitive hurdles will serve the #1 goal, which is to get the reader the info they're looking for with minimum overhead. In this respect, the old HN Search is more usable, precisely for being unoriginal.
Thus, if I were you, I would drop the thumbnail pictures of the stories (it's cool that you can do it, but they don't really add anything and are distracting); would not include the story info with each comment (rather, I would do just like HN does and have a bit of text that says "| on: the-story-title-linked-here"); would make the text rendering look much like HN; and would follow HN's lead in having a text- and information-density-centric design. I'm not saying that HN's design is the global optimum (though I think it is better than most attempts to improve on it), but rather that HN search is an extension of HN and therefore not the place to innovate on its design. You're in a counterintuitive position for a startup with this project, since calling too much attention to yourself in this context is bad. You want to be unobtrusive and have the thing just work; the HN community is smart enough to figure out who you are from that and like you better for it. (That said, you shouldn't obliterate yourselves. For example, I like the visual cues that say "Page 1 of 10, got 237 results in 3 ms" and "1,244,896 stories and 5,289,181 comments indexed". They are unobtrusive and impressive.)
Lastly, a bug in Chrome: if I search for something, scroll to the bottom, click on "about", then hit the back button, the original results page freezes (i.e. refuses to scroll).
1. Is there a way to force off the typo correction? I tried "smartos" both with and without quotes, and either way I get results for the word "smartest", which aren't about SmartOS (an operating system derived from OpenSolaris).
2. In terms of CSS/font/layout, honestly I like hnsearch.com's results better. I think part of it is that I use hnsearch.com as an alternative HN interface, not just a search engine, and it's usable for that purpose, in part because it looks more like regular HN.
The issue for me more generally is that hnsearch.com is almost perfect as an HN search engine, as far as I'm concerned. It does what I want, does it fast, with good coverage and a usable interface. So my advice for alternatives would tend towards just "yeah, make it more like that", which is maybe not the most useful commentary.
2. Having a date filter is better than none, but I STRONGLY preferred the old one, where you could sort in descending order from most recent to oldest. The current date filter restricts the time frame, apparently trying to mimic Google's filter, but still leaves results out of order.
3. Exact match search doesn't work either. It currently behaves like phrase match: as long as the keywords exist, a result is shown, even if they aren't in the exact order specified. Perhaps create two separate ways to do this kind of search.
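To make the distinction in point 3 concrete, here is a minimal Python sketch. It is not how the search engine under discussion works internally; the function names `keyword_match` and `exact_match` and the whitespace tokenizer are assumptions made up for illustration. The first function accepts a document when every query word appears anywhere in it, the second only when the words appear next to each other in the order typed.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def keyword_match(query, document):
    """True when every query word occurs somewhere in the document."""
    doc_words = set(tokenize(document))
    return all(word in doc_words for word in tokenize(query))


def exact_match(query, document):
    """True when the query words occur contiguously and in the same order."""
    q = tokenize(query)
    d = tokenize(document)
    if not q:
        return False
    # Slide a window of len(q) words across the document.
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))
```

With these two functions, a comment containing "an engine for search" passes `keyword_match("search engine", ...)` but fails `exact_match("search engine", ...)`, which is the behaviour a quoted query would be expected to have.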
Also, one bug. If text in a URL gets highlighted and it's linked, the link URL itself picks up the EM tags.
If I understand correctly, it is for enterprise search, not for web search, correct?
The response time is really impressive, especially with the new sort-by-date function. With a NoSQL database, sorting is hard, if not impossible. That's why it wasn't available in the initial release.
This implementation reminded me of the CTF algorithm, which needs to match the input query against a file. I thought it's not meant for large volumes of queries because:
2. After receiving the request, on the server side, there are only three steps to go before the results are returned:
- Initialize a new index (actually it's not a new index, just a new search)
- Set criteria for the order of the attribute sets
- Call search and return
Of course, there are more detailed steps under the search API. The majority of the work is done on the search engine's server side, which keeps crawling and updating the indexes.
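The three request-time steps listed above can be sketched roughly as follows. This is a toy in-memory stand-in written in Python, not the actual search provider's API: the `Index` and `Search` classes, the `set_sort_order` and `run` methods, and the `title`/`date`/`points` fields are all hypothetical names chosen for illustration. The point is only that the expensive work (building the index) happens ahead of time, while each request does little more than create a search object, set the ranking order, and look things up.

```python
from collections import defaultdict


class Index:
    """Toy inverted index, built ahead of time (stand-in for the crawled index)."""

    def __init__(self, documents):
        self.documents = documents
        self.postings = defaultdict(set)  # word -> ids of documents containing it
        for doc_id, doc in enumerate(documents):
            for word in doc["title"].lower().split():
                self.postings[word].add(doc_id)


class Search:
    """Step 1: a new search against an existing index, not a new index."""

    def __init__(self, index):
        self.index = index
        self.sort_keys = []

    def set_sort_order(self, *keys):
        """Step 2: choose which attributes rank the results, in priority order."""
        self.sort_keys = list(keys)
        return self

    def run(self, query):
        """Step 3: intersect the postings for each word, sort, and return."""
        words = query.lower().split()
        if not words:
            return []
        ids = set.intersection(*(self.index.postings.get(w, set()) for w in words))
        hits = [self.index.documents[i] for i in ids]
        # Highest value first, e.g. most recent date or most points.
        return sorted(hits, key=lambda d: tuple(d[k] for k in self.sort_keys), reverse=True)


# Hypothetical sample data; dates are plain integers for simplicity.
index = Index([
    {"title": "SmartOS released", "points": 10, "date": 1},
    {"title": "SmartOS on KVM", "points": 5, "date": 2},
])
results = Search(index).set_sort_order("date", "points").run("smartos")
```

In this sketch the per-request cost depends on the size of the postings being intersected and sorted, which is one way to read the concern about response time growing with the amount of indexed data.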
When the request volume becomes huge, the real-time response may slow down due to the file size.
I appreciate this tool to help me find things quickly on HN.
1) www.hnsearch.com had three sort options: relevance | date | points. It would be great if the new search also had all three options.
2) Please make your legacy style exactly like the old one. That style matched HN style perfectly. Right now there is an extra line which links to the HN thread (we are used to clicking the comments link for that), and the way comments are displayed doesn't feel right.
The response time is actually really impressive!
1. The registry has a deal with registrars to sell premium domains at inflated premium rates (e.g. .tv domains).
2. Once sunrise period for TM holders is over, they are opening the registry and all the registrars offering pre-registration are treating it like the expired domain drops and hammering the registry to try and secure domains the millisecond the registry opens up. People bid/pay registrars to do this on their behalf.
Multiple sites offer pre-registration for domains. Is it a first-in-best-dressed situation? Is it a gamble? Should you pre-register with multiple registrars?