Archive for November, 2004
CSO Magazine Analyst Reports
Tuesday, November 30th, 2004
A couple of interesting and relevant articles from CSO Magazine.
- Trends 2005: Risk And Compliance Management by Michael Rasmussen.
- Clearing Up the Muddled Security Management Market by Andrew Braunberg
Report Quality NOT Quantity
Monday, November 29th, 2004
If you look at any of the SEM/SIM products these days, they all tout how many pre-built reports they have prepared for you. Most of them have a hundred or more, some even have a couple hundred!!
How are you ever going to have time to go through that many reports and find out if they are useful?!
If you look at the report names, they generally go like
That should be ONE report, not THREE!!! It’s a report that shows you the connections aggregated by a certain column!
Vendors should be building more flexible reports that allow users to configure the output. In the example above, the vendor can simply provide one report with a configurable parameter for the column to aggregate on (GROUP BY in SQL terms). That way, users can configure the report however they like and then save all the parameters (aggregates, filters, sort order) as a custom report.
Be wary of vendors touting the quantity of reports as a competitive advantage. The two hundred or so reports may really only be 50 or so.
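To make the idea concrete, here is a minimal sketch of one report whose aggregation column is a parameter. The `connections` table, its column names, and the `connection_report` and `demo_db` functions are all made up for illustration; they don’t come from any particular SEM/SIM product.

```python
import sqlite3

# Hypothetical schema: one row per firewall connection.
ALLOWED_GROUP_BY = {"src_ip", "dst_ip", "dst_port"}


def connection_report(conn, group_by, limit=10):
    """Return (value, connection_count) rows aggregated by one column."""
    # Column names cannot be bound as SQL parameters, so check the
    # requested column against a whitelist before building the query.
    if group_by not in ALLOWED_GROUP_BY:
        raise ValueError(f"cannot aggregate by {group_by!r}")
    sql = (
        f"SELECT {group_by}, COUNT(*) AS n FROM connections "
        f"GROUP BY {group_by} ORDER BY n DESC, {group_by} LIMIT ?"
    )
    return conn.execute(sql, (limit,)).fetchall()


def demo_db():
    """Build a tiny in-memory database with a few sample connections."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE connections (src_ip TEXT, dst_ip TEXT, dst_port INTEGER)"
    )
    conn.executemany(
        "INSERT INTO connections VALUES (?, ?, ?)",
        [
            ("10.0.0.1", "192.168.1.5", 80),
            ("10.0.0.1", "192.168.1.5", 443),
            ("10.0.0.2", "192.168.1.5", 80),
        ],
    )
    return conn
```

Calling `connection_report(demo_db(), "src_ip")` and `connection_report(demo_db(), "dst_port")` gives two different "reports" from the same code, which is the whole point: one report, many saved parameter sets.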
Log Management Requirements for MSPs
Sunday, November 28th, 2004
I spent five years at one of the largest MSSPs as an architect and development manager. We had a couple thousand firewall, VPN, NIDS and HIDS devices that we managed for various hosting and managed service customers. We needed to aggregate all the logs generated by these devices and be able to provide reports and analysis for our customers.
We spent quite a bit of time looking at various COTS products and services, including ArcSight, Intellitactics, netForensics, and others. However, none met our requirements.
In the end, we built our own solution using open source tools such as MySQL, GD::Graph, RRDtool, Cricket, etc.
Below are the eleven requirements that any MSP should consider when building their log management solution.
1. Segregation of Logs by Customer
As an MSSP, one of the biggest concerns we had was the segregation of logs for our many customers. We didn’t want any customer’s data to be mixed in the same files or database tables as another customer’s. This requirement drove many of the design decisions we made while building the support infrastructure.
Imagine two competitors have their firewalls managed by the same MSP and their data is mixed in the same files. What happens if one competitor’s data is shown to the other because of a bug in the software that filters the logs?
2. Raw Log Retention Is Required
One of our requirements was to provide the raw logs to our customers so that they could do their own analysis. We could do all the necessary analysis on our side, but sometimes customers wanted other information from the logs that we didn’t extract. Also, customers sometimes have proprietary information that they can correlate with the logs.
Remember, MSPs build everything on economies of scale, and sometimes it’s difficult to customize the solution for individual customers. Obviously, if the customer wants to pay for professional services, there’s customization we can do. But not every customer wants to or has the budget to do that.
3. Reporting is #1
Customers wanted reports on their security infrastructure. They wanted to see how much traffic (bytes, connections, etc.) the firewalls were passing so they could plan properly for the future. They wanted to see how often users were VPN’ing into the infrastructure so they could identify any misuse. They wanted trend reports to show how their infrastructure was holding up. They wanted correlated reports across the various devices (firewalls, VPN, IDS, etc.) to see if there were any anomalies.
Our #1 priority was to provide all these reports and more to our customers.
4. Then Comes Real-Time Analysis/Alerting
Alerting is another important feature that our customers wanted. They wanted to know when something really bad was happening to their infrastructure. They don’t want to get alerted every time some script kiddie scans their network, but they do want to know if their network all of a sudden has a huge increase in traffic that continues to increase for a long period of time. They want to know when an unauthorized connection is made to their production database. They want to know when a successful attack has happened.
However, the MSPs are the first to receive these alerts and will filter the alerts based on the internal knowledge and SLA. Then the MSP will pass the alerts onto the customer if they are determined to be real.
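As a rough sketch of the "don’t page me for every scan, but do tell me about a sustained jump" idea, the class below only fires when traffic stays above a multiple of a baseline for several intervals in a row. The class name, the fixed baseline, and the default factor and interval count are all assumptions for illustration. It also simplifies "continues to increase" to "stays elevated", which is not quite the same thing.

```python
from collections import deque


class SustainedIncreaseDetector:
    """Alert only when traffic stays high for several consecutive intervals."""

    def __init__(self, baseline_bytes, factor=3.0, intervals=5):
        # Anything above baseline * factor counts as "high" (assumed rule).
        self.limit = baseline_bytes * factor
        # Remember only the most recent `intervals` observations.
        self.recent = deque(maxlen=intervals)

    def observe(self, bytes_this_interval):
        """Record one interval and return True if an alert should fire."""
        self.recent.append(bytes_this_interval > self.limit)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(self.recent)
```

A single spike (a scan, say) fills only one slot in the window, so it never triggers on its own. The MSP’s analysts would still review whatever this produces before passing it on to the customer.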
5. Support All My Log Sources
Most MSPs support an array of devices and applications as part of their service. For example, most MSPs will support firewalls such as PIX, Netscreen, Check Point or iptables, network IDSes such as ISS or Cisco, and host-based IDSes such as ISS, Okena, or Tripwire. Some MSPs will also support SSL accelerators or proxies, vulnerability scanning tools, servers, and many other applications.
At Cable & Wireless, we had firewalls, network and host-based IDSes, scanning tools and authentication services. We needed a solution that would support all these different devices and applications. Most of the logs were sent via syslog, except for Check Point firewall logs, Cisco IDS alerts and Nessus scan results. The Check Point logs were aggregated at the Provider-1 boxes and needed to be offloaded using LEA. The Cisco IDS alerts needed to be retrieved via RDEP. The Nessus scan results were XML files that needed to be parsed.
6. Web-based GUI Only
We needed a web-based GUI so that the customers can access reports, alerts, raw logs and device policies. It’s difficult to support any GUI that requires installation on the customer’s desktop. Non-web applications will eventually have conflicts with other applications and will require support from the MSP. That model is not scalable.
We also needed to embed this interface in our own web based interface. It is much easier to embed another web interface than an application. It is also much easier to skin a web interface as we wanted to have our corporate color scheme and logos there.
7. Distributed Collection Points
Cable & Wireless had over 40 data centers all over the world. Most MSPs probably have distributed environments as well. The devices that we managed were spread across those data centers. They could be in the UK or NY or SF or even HK or Japan. We needed a solution that could collect logs in all these different locations. We needed to be able to collect logs close to the log source so that the chance of dropped logs was minimized. We also wanted to keep the cost of these remote collectors relatively low compared to the central archive.
8. Secure Connections All The Way
As a security organization, everything we did that could be secured had to be secured. Unfortunately we could not secure protocols such as syslog from a PIX, but we had to support SSL for the web interface, Cisco RDEP, Check Point LEA, and encryption between remote collectors and the central storage. We also needed to secure the connection between the database and other components if they were in different networks.
9. No Agents Whatsoever
Due to the risk (performance, application conflicts, etc.) of installing additional software (custom agents) on servers, we needed a solution that didn’t require agents. This requirement is generally not a problem for most devices since they send logs via syslog. However, it’s a big issue for Windows-based operating systems. There are several methods one can use to collect Windows events, as I have written previously. However, all these methods have their problems, such as requiring agents, performance issues, or non-real-time logging. Since we didn’t manage many Windows devices, this was not a major concern for us.
10. Open API for Integration
Another requirement was that we needed to run reports or search logs via scripts or web applications. We ran many background processes that sent out reports to customers or created custom reports based on some other condition. All of these needed to be automated for the sake of economies of scale. This required the solution to provide some type of open API that could be used from other programs.
11. Granular Permission Model
Since we needed to provide alerts, reports and raw logs to customers, we needed to make sure customers had access to ONLY their own information. It was critical that customers didn’t see other customers’ data. We needed the solution to have a granular permission model that could determine which devices belong to which customer, which reports the customers/users can see, which log files can be viewed by customers, etc.
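A permission model like this can start out very small. The sketch below is only an illustration: the `User` class, the `DEVICE_OWNER` mapping, and the two check functions are hypothetical names, and a real system would load this data from an inventory or directory service rather than hardcode it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    name: str
    customer: str
    # Names of the reports this user has been granted.
    reports: frozenset = field(default_factory=frozenset)


# Hypothetical inventory: which customer owns which managed device.
DEVICE_OWNER = {"fw-01": "acme", "ids-07": "globex"}


def can_view_device_logs(user, device):
    """Allow access only to devices owned by the user's own customer."""
    owner = DEVICE_OWNER.get(device)
    # Deny by default: an unknown device is visible to no one.
    return owner is not None and owner == user.customer


def can_view_report(user, report_name, report_customer):
    """A report must belong to the user's customer AND be granted to the user."""
    return report_customer == user.customer and report_name in user.reports
```

The important design choice is deny-by-default. If the device or report isn’t explicitly mapped to the user’s customer, the answer is no.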
—
These are some of the requirements for an MSP. I hope this will help you evaluate and build your own log management infrastructure.
SGUIL - The Analyst Console for Network Security Monitoring
Wednesday, November 24th, 2004
InformIT has a detailed article on Sguil, Why Sguil Is the Best Option for Network Security Monitoring Data. According to the website,
Sguil (pronounced sgweel) is built by network security analysts for network security analysts. Sguil’s main component is an intuitive GUI that provides realtime events from snort/barnyard. It also includes other components which facilitate the practice of Network Security Monitoring and event driven analysis of IDS alerts.
Comments Broken!! Now Fixed!!
Wednesday, November 24th, 2004
Darn, w/ all the tweaking I’ve been doing to combat the spammers, I actually broke my comments!! No one has been able to post any comments.
In any case, it’s fixed now and hopefully there won’t be any more issues.
Pros and Cons of MSSPs
Tuesday, November 23rd, 2004
We will be a bit off topic today as I am thinking about a multi-part blog series on MSSPs. Today we will discuss the pros and cons of outsourcing to an MSSP. Other ideas I have in the pipe for the next few days are:
- Requirements for Choosing an MSSP
- Log Management Requirements for an MSSP
If you have some ideas of what you would like to see, please let me know.
There are many reasons why outsourcing sometimes is a cheaper and better way to go. Note that I said “sometimes”, because everything depends on your requirements. If your requirement is that every security device must be in house and only 2 admins will have access to them, then outsourcing is not for you. So the first thing you need to do is document your requirements.
So here are some reasons why I think outsourcing is an option.
- Cost - MSSPs can get much better deals from vendors than you can on your own, so the cost of hardware and software will be cheaper. Let’s do some simple calculations. If you decide to run firewalls in-house, the cost of a pair of PIX 525s (retail + maintenance) is about $20k. A dedicated security engineer + training will cost you at least $110k a year (a low figure, as I have not added corporate overhead, which could be another 30-40%). Take that over 3 years (that’s usually how long companies depreciate equipment). That gives you about $10k/month. You can get it for much cheaper with an MSSP. Generally you can get a decent SLA for $1-2K/month. Over three years, that’s quite a bit of savings!
- Hardware Upgrades - This may be different for different MSSPs, so be sure to ask if you are looking to outsource. Basically, hardware becomes obsolete very quickly. If you buy your own hardware, in 3 years you will have to spend money upgrading. The original investment you made is now a paperweight. But if you go with an MSSP, you can get the hardware upgrade for free. For example, let’s say Nokia decides to upgrade their IP350 platform from the current processor to a faster one. The MSSP will be able to upgrade you for free, whereas you would have to spend money on your own.
- Software Upgrades - Same as hardware here. You can get software upgrades for free with an MSSP, whereas you might have to pay your own way. For example, from Check Point 4.1 to Check Point NG AI.
- Vendor Support - Because MSSPs buy so much equipment and software from vendors, they get much better support from them as well. They usually have dedicated support from these vendors 24×7, so any problem that arises will get to the right people immediately instead of going through the normal channels. MSSPs can also get patches/fixes/updates much faster. If needed, vendors are sometimes willing to cut an engineering release to fix a HOT problem. Not all MSSPs have the same support contract with vendors, though, so buyer beware.
- 24×7 Support - We are not talking about somebody carrying a pager here; we are talking about having trained security engineers awake and doing work any hour of the day. This is one of the biggest advantages of outsourcing. Economies of scale play a huge role here. The MSSPs can have dedicated engineers working 24×7, whereas you might have your guys waking up in the middle of the night, all grumpy, to fix some problem.
- Expertise/Experience - Because the MSSPs work with firewalls/VPNs/IDS all the time, it is much more likely that they will have encountered the problem you are experiencing. In these situations, the MSSP may be able to fix your problem in 30 mins, whereas you may have to spend hours figuring out what happened and trying to fix it.
- Software Patches - This is perhaps one of the biggest issues in security nowadays. Many organizations simply don’t have the resources or time to keep up with all the security patches or updates on their security devices. The MSSPs HAVE to do that as part of their SLA. Again, this is where economies of scale play a big part. The MSSPs can upgrade all of their security devices, such as firewalls or VPNs, with the appropriate patches when they receive them from the vendors (usually sooner because of their relationships).
- Training - Most of the MSSPs require their engineers to be trained on the devices they service, and they are willing to spend the money to get them trained. Training is certainly not cheap; a PIX or Firewall-1 course can cost anywhere from $3k - $5k. Many of the engineers are also experienced in designing complex & secure networks. I for one am not very fond of certifications (even though I carry a couple). I think anyone with half a brain can pass the certification exams, for example. So when/if you are looking for an outsourcer, beware of anyone telling you that all their engineers are certified. It really doesn’t mean jack. Certifications provide some value, but not a whole lot. It is the hands-on training and experience that count the most.
- Spare Equipment - This is another huge value MSSPs can provide at very little or no cost to you. Because MSSPs manage so much equipment, they cannot wait for vendors to ship them spares when something dies, so they keep extra equipment ready to deploy. And trust me, equipment does die.
- Security Monitoring - One of the hottest topics in the security space is obviously log analysis and management. Many vendors have some type of event correlation engine or tool they use to help you monitor your network. For example, NetSec uses the neuSecure product from independent software maker Guarded.Net; Symantec acquired the correlation engines of Cyberwolf and RipTech; Savvis, Ubizen and others have their own home-grown solutions. Again, depending on the MSSP, you may have to pay extra for this service or you may get it as part of your SLA.
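For anyone who wants to plug in their own numbers, the arithmetic from the Cost item can be written out as below. The figures are the ones quoted in that item. I am assuming the $110k engineer cost is per year, since that is what makes the total come out near $10k/month. The `monthly_cost` function name is just for illustration.

```python
def monthly_cost(hardware, yearly_staff, years=3):
    """Spread hardware plus staff cost evenly over the depreciation period."""
    total = hardware + yearly_staff * years
    return total / (years * 12)


# In-house: a pair of PIX 525s plus one dedicated engineer (assumed yearly cost).
in_house = monthly_cost(hardware=20_000, yearly_staff=110_000)

# Outsourced: the $1-2K/month SLA range.
mssp_low, mssp_high = 1_000, 2_000

# Savings over the same 36 months, at both ends of the MSSP price range.
savings_low = (in_house - mssp_high) * 36
savings_high = (in_house - mssp_low) * 36

print(f"in-house: ${in_house:,.0f}/month")
print(f"3-year savings: ${savings_low:,.0f} to ${savings_high:,.0f}")
```

Add the 30-40% corporate overhead to `yearly_staff` and the gap gets wider still.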
However, there are some disadvantages to outsourcing as well.
- Control. You lose some or all control of the device itself, although you still have control of the policy. If you can’t swallow that, look for an MSSP that will share access with you.
- Corporate-specific Knowledge. The MSSP will not know your organization as well as you do, so it is your responsibility to work with the MSSP to make sure they understand what you need.
- Security Requirements. Your security requirements may be stricter than the MSSP’s. For example, your requirement says only 2 people have access to the firewall, but the MSSP may have more engineers working on it.
So definitely find out all the different requirements you have, and see if the MSSP can meet them. Make sure you ask all the questions you have, and don’t let the MSSP bs you into something that you are not sure about. In other words, do your research first.
AT&T Getting Into the SEM Game
Monday, November 22nd, 2004
So it looks like AT&T wants to get into the SEM game as well, according to this eWeek article.
AT&T is also working on a security event management system called Aurora that it plans to sell as a software solution. The system relies on the company’s Daytona database and is designed to do more than simple event correlation and normalization. Aurora includes a console that gives administrators access to real-time alerts about attacks and vulnerability advisories, as well as live case management and descriptions of the methods and procedures AT&T analysts are using to handle the event.
It seems like everyone in the SEM space always claims that they “do more than simple event correlation and normalization.” The problem is most SEM products don’t even do the “simple event correlation and normalization” well. Instead of adding all kinds of fluff, maybe they should concentrate on doing one thing well at a time.
A Firewall Log Analysis Primer
Sunday, November 21st, 2004
Found this while googling.
A Firewall Log Analysis Primer From LURHQ. (pdf version)
It’s fairly basic but a good start nonetheless.
Troubleshooting Behavior
Sunday, November 21st, 2004
I am interested in finding out how you go about troubleshooting network (e.g. downtime), security (e.g. virus infection) or application problems using logs. What logs do you look at? What system tools do you use? What do you look for in the logs?
I know this is a pretty general question and it’s probably difficult to write the process down.
I would be interested in any thoughts or comments you have. Thanks in advance.
Forensic Log Parsing with Microsoft’s LogParser
Saturday, November 20th, 2004
A nice and detailed article by Mark Burnett on Microsoft’s LogParser. According to Microsoft:
Log Parser 2.0 is a powerful, versatile tool that you can use to extract information from files of almost any format by using Structured Query Language (SQL)-like queries.
More information on this tool can be found on Microsoft’s site.
Security Management Systems
Friday, November 19th, 2004
Found this paper by Dan Keldsen on Security Management Systems (or SIM/SEM). A bit dated but worth reading.
Five Factors to Consider When Building Your Logging Infrastructure
Friday, November 19th, 2004
Whether you are building your own home-grown logging infrastructure (which of course I do not recommend ;)) or evaluating a log management solution, there are at least five factors you should consider.
1. Log Retention
The log retention period obviously depends on your requirements. If you are building out the infrastructure for troubleshooting and short term reporting, you may only need to keep 1-2 months of logs. But if you are doing it so you can be in compliance with SOX or HIPAA regulations, you will need to keep AT LEAST 6 months for the auditors.
As a rule of thumb, if your requirement is regulatory compliance, make the retention period 12 months to be safe. If you can afford it or the product can support it, go even longer.
Obviously, how long your retention period can be also depends on the volume of logs you receive as well as the product/tool’s ability to manage the log storage. If you are building your own, be sure to build a log rotation process. For example, if your retention period is 12 months, your process should remove logs older than that or put them on tape. If you are evaluating a product, be sure the product has the capability of rotating/purging old logs for you. Don’t spend $200K and then have to write your own scripts.
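If you do end up building your own, the rotation process doesn’t have to be fancy. Here is a minimal sketch that gzips log files older than the retention period into an archive directory and removes the originals. The `apply_retention` name, the `*.log` naming convention, and the use of an archive directory as a stand-in for tape are all my assumptions for the example.

```python
import gzip
import shutil
import time
from pathlib import Path


def apply_retention(log_dir, archive_dir, retention_days, now=None):
    """Compress logs older than retention_days into archive_dir, then delete them."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds per day
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(Path(log_dir).glob("*.log")):
        # Age is judged by last-modified time (an assumption; a real
        # process might use a date embedded in the file name instead).
        if path.stat().st_mtime < cutoff:
            target = archive / (path.name + ".gz")
            with path.open("rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            path.unlink()
            moved.append(target)
    return moved
```

Run from cron once a day with something like `apply_retention("/var/log/fw", "/archive/fw", 365)` (both paths are placeholders) and you have the bare minimum. A real process would also want to verify the archive before deleting anything.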
2. Log Volume
Log volume is probably one of the most critical factors in building your infrastructure. It has direct impacts on your retention policy, report/search performance, aggregation performance and correlation performance.
Vendors talk about log volumes in many ways, but it all comes down to the number of log messages per second you receive. With that number in hand, you can calculate how much storage space you will need. For example, if your log message rate is 2000/second, assuming 200 bytes per message (which is fairly normal), we have
2000 * 3600 * 24 = 172.8 million messages / day
172.8M * 200 bytes =~ 34.6GB / day * 30 days = ~1TB / month
That’s quite a bit of data. This exercise brings up a few things you should be aware of.
First, you need a product, or need to develop a solution, that can handle the message rate your environment generates. In this example, get something that can handle at least 3000 messages per second: 2000 for your requirement, plus at least another 30% for growth and possible spikes. If you are evaluating a product, test it to make sure it doesn’t drop any of your logs due to performance issues of the software/appliance.
Second, you need something that will compress the log archives. With 1TB/month, your storage requirement will go through the roof!! Even gzip will give you at least 10:1 compression on the logs.
Third, note how I used 200 bytes per message? Well, if you parse it and put it in a database, the storage requirement per message will increase. For example, ArcSight uses 2KB/message for their calculations. That takes the 1-month retention storage requirement to around 10TB! Ask your vendors or do your own calculation on what the REAL storage requirement is. Make sure the product you are looking at has enough storage space for your retention policy.
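It’s worth working these numbers through in code so you can swap in your own message rate. The sketch below uses the 2000 messages/second and 200 bytes/message from the example. The `storage_estimate` function, the decimal units (1 GB = 10^9 bytes), and treating 2KB as 2000 bytes are my assumptions.

```python
GB = 10**9
TB = 10**12


def storage_estimate(msgs_per_sec, bytes_per_msg, days=30, compression=1.0):
    """Return (messages per day, bytes per day, bytes over `days`)."""
    per_day_msgs = msgs_per_sec * 3600 * 24
    per_day_bytes = per_day_msgs * bytes_per_msg / compression
    return per_day_msgs, per_day_bytes, per_day_bytes * days


raw = storage_estimate(2000, 200)                      # raw log lines
parsed = storage_estimate(2000, 2000)                  # 2KB per parsed message
gzipped = storage_estimate(2000, 200, compression=10)  # assumed 10:1 gzip ratio

print(f"raw:     {raw[1] / GB:.1f} GB/day, {raw[2] / TB:.2f} TB/30 days")
print(f"parsed:  {parsed[1] / GB:.1f} GB/day, {parsed[2] / TB:.2f} TB/30 days")
print(f"gzipped: {gzipped[1] / GB:.1f} GB/day, {gzipped[2] / TB:.2f} TB/30 days")
```

Run as written, the 200-byte case comes to roughly 34.6 GB a day and a little over 1 TB for 30 days. The 2KB case is ten times that, and a 10:1 compression ratio brings the raw archive back down to about a tenth.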
Last but not least, your log volume really impacts the performance of the product you choose. As the volume grows in the database, some products will have problems running reports that cover a long period of time. If you need to run a report spanning a month or two, it may take hours for a single report. It’s difficult to test this during an evaluation period, but many implementations fail because of this.
3. Log Sources
What are all the devices, servers and applications that will be logging? If you are developing your own solution, there may be a lot of work for you to do in order to parse the various log messages. The good thing is you will only need to parse the specific logs you need and not everything.
There are many different logging methods (file, database, syslog, proprietary) and formats (single-line, multi-line, XML, database records).
Most vendors will show you a list of all the logs their product will support. Some vendors will support 100’s of log sources across many different categories, such as firewalls, routers, switches, IDSes, web servers, mail servers, access control software, operating systems, etc etc etc.
Make sure the product you are looking at supports all your log sources. If not, make sure that there’s a way for you to develop new parsers for it. Most of the time it will just be some regular expression for parsing logs.
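To give a feel for what "some regular expression for parsing logs" means in practice, here is a small sketch. The line format is invented for this example and is not the output of any real firewall; each real device needs its own pattern. `LINE_RE` and `parse_line` are hypothetical names.

```python
import re

# Invented format for illustration only, e.g.:
#   Nov 19 10:15:02 fw-01 DENY tcp 10.0.0.5:34567 -> 192.168.1.10:22
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<action>ACCEPT|DENY) "
    r"(?P<proto>\w+) "
    r"(?P<src_ip>[\d.]+):(?P<src_port>\d+) -> "
    r"(?P<dst_ip>[\d.]+):(?P<dst_port>\d+)$"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    # Ports are more useful as integers for sorting and aggregation.
    fields["src_port"] = int(fields["src_port"])
    fields["dst_port"] = int(fields["dst_port"])
    return fields
```

Returning `None` for lines that don’t match matters more than it looks. You want to count and review the unparsed lines, not silently throw them away.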
Make sure the product will support some of the native logging methods and formats. For example, Check Point logs can be retrieved via the LEA protocol and Cisco IDS via RDEP. Windows event logs are just a pain in the butt if your central log retriever is a non-Windows platform.
Some products will accept ANY log even if it doesn’t parse them. That will allow you to archive the logs and do some rudimentary search and alerts on them, but not do detailed reports.
However, some products hardcode the parsers in their code and give you no way to create any new parsing intelligence. Beware of what you are getting into if that’s the product you are looking at.
4. Log Analysis
Ok, so this is a huge area. Log analysis includes everything from reports and correlation to anomaly detection and trend analysis. It again depends on your requirements. However, your solution or product should have some of the basic functions such as threshold and rule-based alerts via email or SNMP.
Most vendors will provide pre-defined reports that cover the Top N reports across most of the log sources they support. The pre-defined reports are basically the intelligence of the products. Without them, the product is basically useless and you will need to spend a lot of time configuring it instead of using it.
Log analysis can cover many different areas, including security incident detection, discovery of virus-infected machines, device/application up/down, usage analysis and capacity planning. Most of the SIM products basically focus on just security incident detection. If your requirement is not just security, make sure the products can handle it.
Report and correlation performance is a critical factor. If reports take hours, they are somewhat useless when you need a quick ad-hoc report to figure out which IPs are DoSing you. Build your infrastructure w/ at least 30% more performance than you need. That way you have some room to grow and can still run quick reports when you are receiving a spike of logs.
5. Network Topology
Your network topology impacts how you should architect your logging infrastructure. If you have a fairly distributed topology, e.g. many remote locations, you will want to design a solution or look for a product that has a distributed architecture, one that can retrieve/receive logs in a distributed manner and forward them back to a central location for analysis and archival.
If you just have a single central location (I can’t imagine anyone having that kind of infrastructure these days), you can probably get away with a product that can’t be architected in a distributed manner.
Ok, so here comes the downside of distributed architecture. Price! Any additional component you add will cost you. Be sure to check w/ the vendors to see how much it will REALLY cost. Also, some of the smaller devices that vendors provide for remote locations can only handle a lower message rate. Make sure the ones you choose for each location can meet the requirement and have some room to grow.
—
I hope this helps you in understanding what’s needed to build your logging infrastructure. Please let me know if you have any questions or comments.
P.S. I would love to see a review of log management products w/ these 5 factors in mind and actually score the products for each factor.
SIMplicity (SIM bake off)
Thursday, November 18th, 2004
Fresh off the press.
Information Security Magazine has an interesting article on “Security information management tools refine the deluge of raw data into actionable intelligence”.
I will write more about it later, but thought you might be interested in reading it first.
I would love to hear from you on what you think of the review and whether the information is accurate.
Data Life-cycle Management
Wednesday, November 17th, 2004
Interesting article from ComputerWorld on Data Life-cycle Management.
Not totally log related but it has many of the same characteristics and requirements of log management. Namely
- Data protection
- Data retention and compliance
- Data resource management
To eval or not to eval
Monday, November 15th, 2004
One of the biggest mistakes I have seen many organizations make is that they don’t evaluate the product they are buying. These organizations spend time creating an RFP, reviewing the RFP responses, talking to the vendors, even doing due diligence on the vendors, but they don’t spend the time playing with the product before actually writing the check. They trust the vendors’ white papers, the RFP responses and the marketing.
As I mentioned previously, the lack of clear understanding of the products is one of the five business mistakes you can make.
Without using a product, you will not understand whether it actually meets your requirements or not. If your requirement is root-cause analysis or forensic investigation, does the product you are choosing support that? Does it support drill-downs for you to do investigation? Does it support ad-hoc reports for you to easily find the information you need? Does it keep the raw logs so you can actually see the data?
If your requirement is management reporting, does the product support PDF export? Does it support emailing? Does it have nice charts and graphs? (Believe me, management doesn’t care to see the raw data.) What kind of reports does it have? Does it let you create new custom reports and email those or just the canned reports?
If your requirement is to support 1000 logs per second, does the product support that type of performance? Does it have room for you to grow? Did you test it to see that it didn’t drop any logs? (Trust me on this one too, it happens! Since most logs are sent via syslog, they will be dropped silently!) What’s the peak and sustained performance of the software/appliance?
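One simple way to check for silently dropped logs during an eval is to send numbered test messages and then look for gaps in what the product actually stored. The sketch below only covers the numbering and the gap check. How you send the messages and export what was received depends on the product, so that part is left out. The `seq=` tag and both function names are made up for illustration.

```python
import re

SEQ_RE = re.compile(r"\bseq=(\d+)\b")


def make_test_messages(count, tag="droptest"):
    """Generate numbered test messages to push through the product under test."""
    return [f"{tag} seq={i}" for i in range(count)]


def missing_sequences(received_lines, expected_count):
    """Return the sorted sequence numbers that never showed up."""
    seen = set()
    for line in received_lines:
        m = SEQ_RE.search(line)
        if m:
            seen.add(int(m.group(1)))
    return sorted(set(range(expected_count)) - seen)
```

Send the messages at your target rate (and then at your expected peak rate), export the stored logs, and feed them to `missing_sequences`. An empty list means nothing was dropped at that rate.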
Does the product meet the security requirements of your corporate standards? Does it support your corporate authentication and authorization mechanisms? Does it support role-based authorization? Is the software installed on your server (or the appliance the vendor provides) locked down?
Vendors answer your RFP using the most general language possible and they always try to highlight certain features and hide others. They may interpret certain terms very broadly when you actually mean something very specific.
Be sure to understand your requirements and test the vendors’ claims. Never take the vendors’ word that their products meet your requirements. Put your hands on the product, use it, and see how easy it is to perform your most important tasks.
There are generally two ways you can evaluate a product. Some vendors will install the software or appliance in your network so you can play with it. Some vendors will host an evaluation for you on their own network. Either way is fine, as long as you are in the driver’s seat and not the vendor’s SE.