Web programming for programmers who aren't web programmers
So I'm thinking about learning some web programming stuff just for the heck of it -- partly to just branch out my overall programming expertise and partly to help my wife out when she works on web designs (she's a graphic/web designer, not a programmer). However, I am a systems programmer, not a web programmer. Most of what I work on is lower-level stuff in C and C++ (with a little bit of Java on occasion) and, in general, once you install the software I write you will only notice it's there when it's not working and it tends to be background server stuff.
Anyway, learning the languages typically used for web programming isn't an issue -- I'm already reasonably comfortable with Python and can pick up JavaScript (probably with a good browser abstraction library like jQuery or something), PHP, or whatever other languages I need to know without too much difficulty. I'm also pretty handy at setting up a basic Linux server and can pick up anything else I need to know for a non-stock web server installation without much difficulty either.
What I'm mostly interested in is a guide to what to do and what not to do for the beginning web programmer -- best practices, things to avoid, tips for avoiding potential security issues, and so on. Ideally, I'd want something aimed at someone who already knows programming. Any recommendations or pearls of wisdom from your own experiences?
Comments
The two major security things to watch out for are injection attacks and cross-site scripting (XSS) / cross-site request forgery (CSRF).
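To make the injection point concrete, here's a minimal sketch using Python's built-in sqlite3 module. The table, the function names, and the hostile input are all made up for illustration; the thing to take away is the difference between pasting user input into the SQL text and passing it as a parameter.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])


def find_user_unsafe(name):
    # DON'T: the user's text becomes part of the SQL statement itself.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name):
    # DO: the driver sends the value separately from the SQL text,
    # so it can only ever be treated as data.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


hostile = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(hostile)  # the OR clause matches every row
safe_rows = find_user_safe(hostile)      # no user has that literal name
```

Most database drivers and ORMs offer the same kind of placeholder mechanism, though the placeholder syntax varies from driver to driver.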
For performance the key is mostly to reduce the number of SQL queries per page load and make those queries as fast as possible. The database is almost always the bottleneck unless you do something weird.
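As a rough illustration of cutting down queries per page load, here's a sketch with sqlite3 and two made-up tables (posts and comments). The first function issues one query per post on top of the initial one; the second gets the same answer in a single query.

```python
import sqlite3

# Hypothetical schema and data, just for the comparison.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (post_id INTEGER, body TEXT);
    INSERT INTO posts VALUES (1, 'first'), (2, 'second');
    INSERT INTO comments VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")


def comment_counts_one_query_per_post():
    # 1 query for the posts, then 1 more for every post found.
    counts = {}
    posts = conn.execute("SELECT id, title FROM posts").fetchall()
    for post_id, title in posts:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM comments WHERE post_id = ?", (post_id,)
        ).fetchone()
        counts[title] = n
    return counts


def comment_counts_single_query():
    # Same result in one round trip: let the database do the join.
    rows = conn.execute("""
        SELECT p.title, COUNT(c.post_id)
        FROM posts p
        LEFT JOIN comments c ON c.post_id = p.id
        GROUP BY p.id
    """).fetchall()
    return dict(rows)
```

With two posts the difference is invisible, but the first version grows by one query for every post on the page.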
I guess I'll be reading up on injection and xss attacks/forgeries to find out how they tend to slip into your system. I'll also need to brush up on my SQL as I only have a very basic knowledge of it. Thanks.
edit: think of SQL as something to be encapsulated in another language.
I'm more of the opinion that while ORMs aren't bad, per se, you need to know enough about SQL that when your ORM is flaking out on you, you can read the SQL it's producing to make sure it's actually what you want and/or replace it with your own hand-written custom SQL as necessary.
http://flask.pocoo.org/
It's a Python micro-framework that lets you attach functions to URL patterns. Then when the web server gets hit at a matching URL, that function is executed and its return value is sent as the HTTP response. It's a good way to learn web programming without going too low or too high a level. Once you get the idea, you can move on to a full framework like Django or whatever without hurting yourself. I tend to use Jinja templates with Flask. http://jinja.pocoo.org/docs/
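For a feel of what "attach functions to URL patterns" looks like, here's a minimal sketch. It assumes Flask is installed; the two routes and their names are just examples, not anything from a real site.

```python
from flask import Flask
from markupsafe import escape  # installed alongside Flask

app = Flask(__name__)


@app.route("/")
def index():
    # The return value becomes the body of the HTTP response.
    return "Hello from the index page"


@app.route("/hello/<name>")
def hello(name):
    # <name> is captured from the URL. Escaping it before putting it in
    # the page keeps user-supplied text from being interpreted as HTML,
    # which is the basic defence against XSS.
    return f"Hello, {escape(name)}!"
```

You'd normally run this with Flask's built-in development server while learning, and put a real web server in front of it for anything public.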
But that block quote you made a comment about from my earlier post had some relevant info other than socket programming.
Instead what I do is use an AMQP message queue. Specifically I use rabbitmq and celery and django-celery. This allows me to put scheduled tasks as well as tasks with delayed execution right into my codebase directly. For example, when I upload an mp3 to frontrowcrew.com there is a celery job that sends it to Libsyn via FTP. Likewise, the automatic tweets of daily episodes are a scheduled celery job.
Another concern is what the website is used for. Generally I tell everyone to build with HTTPS in mind from the ground up now. If you're doing anything with credit cards it's required, but even if you're just submitting usernames and passwords I would recommend getting a valid certificate and using it for everything that isn't a brochure site.
If you're doing anything with sessions or cookies (and sessions store a cookie, so that's important too), you should read up on the practices used to protect yourself from somebody spoofing, duplicating, or predicting your session IDs or cookie information. Even big-name websites, I've found, have flaws in this area. While I was signing up for my PAX hotel room last year, I was actually put into another user's session and could see their registration information because (from what I could figure) they assigned me a session ID that belonged to someone else while it was still active. This is one of those areas where you'll never be 100% secure, but you can make it reasonably unlikely that anyone will abuse it.
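A small sketch of two of the usual precautions, using only the Python standard library: generate session IDs from a cryptographically secure random source so they can't be predicted, and set the cookie flags that limit how the ID can leak. The in-memory `sessions` dict and the function names are hypothetical stand-ins; a real framework's session machinery handles most of this for you.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory store: session ID -> session data.
sessions = {}


def create_session(user):
    # 32 random bytes from the OS's secure random source, URL-safe encoded.
    # Never derive session IDs from counters, timestamps, or usernames.
    session_id = secrets.token_urlsafe(32)
    sessions[session_id] = {"user": user}
    return session_id


def session_cookie_header(session_id):
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["secure"] = True     # only sent over HTTPS
    cookie["session_id"]["httponly"] = True   # not readable from JavaScript
    cookie["session_id"]["samesite"] = "Lax"  # limits cross-site sending
    # Value for a Set-Cookie response header.
    return cookie["session_id"].OutputString()
```

Random IDs wouldn't by themselves have prevented the kind of mix-up described above; the server also has to make sure an ID is never handed to a second user while it is still live.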
Regarding the back-end, you may want to consider isolating the ajax layer from the stuff you want to protect. On my current project, we're building something heavily ajax-reliant. In order to make it more secure, the ajax interface only ever talks to an api we've been building that exists on a sort of middle server. The middle server is firewalled off from everything but the webserver, and has only the functionality we allow it to interact with our back-end. Not something you'll need to do immediately, but this gives us some extra protection so that if someone with poor practices (or harmful intent) leaves a giant open security hole on the webserver, the damage is mitigated since it can only abuse functionality that we've decided would be acceptable for the website to interact with. The webserver has its own sql database for some actions, but even if someone gained total control over it they wouldn't have access (hopefully) to anything but what the api server allows.
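The core of that idea can be sketched in a few lines: the API layer exposes a fixed allowlist of operations and refuses everything else, so a compromised web server can only call what's on the list. Everything here (the `PROFILES` data, the action name, `handle_request`) is hypothetical and only illustrates the shape, not the actual project described above.

```python
# Stand-in for data living on the protected back end.
PROFILES = {1: {"name": "alice", "email": "alice@example.com"}}


def get_public_profile(user_id):
    # Deliberately returns only the fields the website is allowed to see.
    return {"name": PROFILES[user_id]["name"]}


# The complete set of operations the web server may request.
ALLOWED_ACTIONS = {"get_public_profile": get_public_profile}


def handle_request(action, params):
    handler = ALLOWED_ACTIONS.get(action)
    if handler is None:
        # Anything not explicitly listed is rejected, not guessed at.
        raise PermissionError(f"action not allowed: {action}")
    return handler(**params)
```

For example, `handle_request("get_public_profile", {"user_id": 1})` returns the name only, while a request for an action that isn't listed raises `PermissionError`.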
Meh, and these are just my pet issues. It doesn't even scratch the surface, really. XSS attacks are a concern, how you store and set up user passwords is another, and then there are the security requirements specific to a particular website (PCI compliance, telecommunications laws, banking).
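On the password storage point, the usual advice is to never store the password itself, only a salted hash produced by a deliberately slow function. Here's a minimal sketch using PBKDF2 from Python's standard library; the iteration count and function names are assumptions for illustration, and a maintained password-hashing library or your framework's built-in auth is generally a better choice than rolling your own.

```python
import hashlib
import hmac
import os

# Assumed work factor for the sketch; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    # A fresh random salt per user makes identical passwords hash differently.
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest  # store both; the salt is not secret


def verify_password(password, salt, expected_digest):
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected_digest)
```

At signup you'd call `hash_password` and save the salt and digest; at login you'd call `verify_password` with the stored values.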
Start here:
http://www.w3schools.com/php/default.asp
and here:
http://www.tizag.com/phpT/
Reference:
http://php.net/manual/en/function.mysql-real-escape-string.php