Database Inference


Inference Overview

In the context of database security, inference is the act or process of deriving sensitive information from premises known or assumed to be true. In a multilevel secure DBMS, an inference attack occurs when a low-level user is able to infer sensitive information through common knowledge and authorized query responses, without directly accessing the sensitive data itself. (Morgenstern, 1987)

A technique that facilitates database inference is data mining, a method used to discover patterns within sets of data. Data mining offers a number of useful tools, such as “data association”, which is a user-defined grouping of seemingly unrelated groups and elements. (Raman, 2001) Association rules are valuable for analyzing consumer behavior, and consist of antecedent (if) and consequent (then) statements, as in “if” a customer buys “x”, “then” they will buy “y”.
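As a minimal sketch of how such an if/then rule might be scored, the snippet below computes the two measures commonly attached to an association rule: support (how often the items occur together) and confidence (how often the consequent appears when the antecedent does). The baskets and item names are hypothetical and are not drawn from any of the cited sources.

```python
# Hypothetical shopping baskets; the item names are illustrative only.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter"},
    {"bread", "butter", "jam"},
]

def count_containing(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket)

def support(itemset, baskets):
    """Fraction of all baskets that contain the whole itemset."""
    return count_containing(itemset, baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent."""
    both = count_containing(set(antecedent) | set(consequent), baskets)
    return both / count_containing(antecedent, baskets)

# Rule under test: "if" a customer buys bread, "then" they will buy butter.
rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

A rule with high confidence lets an observer who knows only the antecedent guess the consequent, which is what makes association rules useful for inference.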

Another data mining tool is “aggregation”, which occurs when information is gathered and expressed in a summary format for statistical analysis, such as examination of mean, median, standard deviation, and other parameters.

Methods of Attack

Out of Channel

This is a particularly difficult inference vulnerability to protect the DBMS against, as much of the data that is acquired comes from external sources. In this type of attack, the attacker makes extensive use of freely accessible information sources and uses that data to perform inference against a secured database.

Indirect Attacks

This type of attack is accomplished by the use of intermediate results gleaned from aggregates such as the mean, median, and standard deviation, from the Sum and Count functions, or from set theory.
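One way to picture an indirect attack built on the Sum function is a differencing query: two permitted aggregates whose query sets differ by a single person. The sketch below uses Python's built-in sqlite3 module with a hypothetical staff table; the table, column names, and salary figures are assumptions made for illustration.

```python
import sqlite3

# Hypothetical table; names and salaries are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Alice", "Clerk", 35000), ("Carol", "Clerk", 33000), ("Erin", "Clerk", 36000)],
)

def aggregate(sql, params=()):
    """Run an aggregate query and return its single value."""
    return conn.execute(sql, params).fetchone()[0]

# Both queries return only an aggregate, so each looks harmless on its own.
total_all = aggregate("SELECT SUM(salary) FROM staff WHERE dept = ?", ("Clerk",))
total_others = aggregate(
    "SELECT SUM(salary) FROM staff WHERE dept = ? AND name <> ?",
    ("Clerk", "Alice"),
)

# The difference of the two sums isolates one individual's value.
inferred_salary = total_all - total_others
print(inferred_salary)
```

Neither query reads an individual row directly, yet together they reveal one.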

Direct Attacks

This type of attack is typically conducted against a DBMS with poor security, such as inadequate mandatory access control (MAC) and discretionary access control (DAC) configurations. In the direct attack, queries that will elicit small responses are launched at the DBMS.

Logical Inferences

The logical inference is often considered a type of direct attack, but may be designated as an indirect attack, depending upon its level of complexity. This type of attack makes use of association rules, and of the data mining strategies of apriori algorithms and clustering.

Statistical Inferences

This indirect attack utilizes aggregate data and mathematical and statistical analysis to derive inferences on numerical data or textual data sets. The textual data can be enumerated or represented as frequencies or counts, and this same statistical method can then be used to derive associations. (Hylkema, 2009)

Query Results

We will use statistical inferencing to extrapolate the unknown salaries of Alice, Bob, and Dan. To accomplish this, we take the average salaries, culled from the Java applet “Database Inference”, of the various groups of which each person is known to be a member, and calculate their mean. In the case of Alice, who is in the “Clerk”, “Support”, and “3rd floor” groups, we will use the following figures:

All Clerks Avg. $34,500

All Support Avg. $35,500

All 3rd Floor Avg. $35,000

Thus, to estimate Alice’s salary, we would utilize the following formula: (34,500 + 35,500 + 35,000) / 3 = 105,000 / 3 = 35,000. Therefore, we can infer that Alice’s salary is approximately $35,000.

Using the same methodology, we can deduce Bob’s salary. Bob is a member of “Admin” and “Sales”, with no floor designated, which equates to (38,500 + 52,625) / 2 = 91,125 / 2, for a statistical inference of a $45,562.50 salary for Bob.

Based upon Dan’s group memberships, “Supervisor”, “Sales”, and “Basement”, our calculation (68,333 + 52,625 + 68,333) / 3 = 189,291 / 3 produces an inferred salary of $63,097 for Dan.
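The arithmetic above can be checked with a few lines of Python. The group averages are the figures quoted in the text; the dictionary layout and variable names are assumptions made for this sketch. The mean of several group averages is only a rough estimate of any one member's salary, since the groups differ in size and overlap.

```python
from statistics import mean

# Group averages quoted in the text, keyed by the person whose salary
# we are trying to infer.
group_averages = {
    "Alice": [34_500, 35_500, 35_000],   # Clerk, Support, 3rd floor
    "Bob": [38_500, 52_625],             # Admin, Sales
    "Dan": [68_333, 52_625, 68_333],     # Supervisor, Sales, Basement
}

# The inferred salary is the mean of the averages of the person's groups.
estimates = {name: mean(values) for name, values in group_averages.items()}

for name, estimate in estimates.items():
    print(f"{name}: ${estimate:,.2f}")
```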

Mitigation Methods

Suppression and concealing

In suppression, some query results are withheld, for example by rounding values or by presenting only a random sample or a range of results. Similarly, in concealing, data may be approximated, combined, rounded, or returned as a range or a random sample of results.

Random Data Perturbation

Random data perturbation works by adding a random amount of error to the data returned in response to a query, so that individual answers are slightly inaccurate while aggregate trends remain usable.
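A minimal sketch of the idea, assuming bounded uniform noise added to an aggregate before it is returned. The function name, the noise bound, and the salary list are hypothetical; a production system would choose its noise distribution far more carefully.

```python
import random

def perturbed_mean(values, noise_scale, rng):
    """Return the mean of values plus uniform noise in
    [-noise_scale, +noise_scale]."""
    true_mean = sum(values) / len(values)
    return true_mean + rng.uniform(-noise_scale, noise_scale)

# A seeded generator is used here only so the sketch is repeatable;
# a real deployment would not use a predictable seed.
rng = random.Random(42)
salaries = [34_500, 35_500, 35_000]   # illustrative figures
answer = perturbed_mean(salaries, noise_scale=500, rng=rng)
print(round(answer, 2))
```

The larger the noise bound, the harder it is to infer an exact value, at the cost of less accurate answers for legitimate users.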


Partitioning

Partitioning consists of segregating data based upon its degree of sensitivity. While highly effective in enhancing the confidentiality of our data, this technique has a downside: it introduces redundancy and complexity into DBMS administration.


Classifying data by sensitivity is a technique utilized in multilevel DBMSs to preclude inference. Data is classified based upon sensitivity ratings, and end users are able to access only the data for which they hold the requisite clearance.

Query Controls

This inference prevention method is typically used to counter indirect attacks. The query control will process the incoming query, the resultant output, or perhaps both, and deny queries or results that do not conform to DBMS inference policies.
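One simple form of query control is a minimum query-set size: an aggregate is refused when too few rows contribute to it. The sketch below assumes a hypothetical staff table and an arbitrary threshold of three rows; neither comes from the cited sources.

```python
import sqlite3

MIN_QUERY_SET_SIZE = 3  # hypothetical policy threshold

# Hypothetical table; names and salaries are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [
        ("Alice", "Clerk", 34000),
        ("Carol", "Clerk", 35000),
        ("Erin", "Clerk", 36000),
        ("Bob", "Sales", 52000),
    ],
)

def guarded_average(connection, dept):
    """Return the average salary for dept, or None when the query set
    is too small to release safely."""
    count, average = connection.execute(
        "SELECT COUNT(*), AVG(salary) FROM staff WHERE dept = ?", (dept,)
    ).fetchone()
    if count < MIN_QUERY_SET_SIZE:
        return None  # deny: the result would describe too few people
    return average

print(guarded_average(conn, "Clerk"))  # released
print(guarded_average(conn, "Sales"))  # denied
```

A size check alone does not stop a pair of large queries that differ by one row, which is why it is usually combined with other controls.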

Preprocessing and Result Analysis

Query preprocessing occurs prior to query execution, and is used to prevent questionable queries. Conversely, query result analysis is performed after query execution, and is used to prevent dubious results from being too precise, particularly those that may have been missed by the preprocessing stage.

Query History Retention

Typically, in query history retention, clustering algorithms are utilized to archive the queries of users or groups, to ensure that multiple queries are not being used to perform inferences on classified data. Collecting information on groups can assist with the mitigation of collaborative inferencing, though it does require more system resources and may generate false positives. (Hylkema, 2009)
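The sketch below illustrates the general idea with a much simpler check than clustering: it remembers the set of rows each user's earlier queries touched, and refuses a new query whose row set differs from a previous one by only a few rows. The class name, the threshold, and the use of row identifiers are all assumptions made for illustration.

```python
class QueryHistory:
    """Remember which rows each user's queries covered, and flag
    near-duplicate query sets that could isolate individual rows."""

    def __init__(self, min_difference=2):
        self.min_difference = min_difference
        self.history = {}  # user -> list of frozensets of row ids

    def allow(self, user, row_ids):
        new_set = frozenset(row_ids)
        for old_set in self.history.get(user, []):
            # Rows covered by exactly one of the two queries.
            difference = len(new_set ^ old_set)
            if 0 < difference < self.min_difference:
                return False  # too close to an earlier query: deny
        self.history.setdefault(user, []).append(new_set)
        return True

history = QueryHistory()
print(history.allow("low_user", {1, 2, 3}))  # first query is recorded
print(history.allow("low_user", {2, 3}))     # differs by one row: denied
```

Keying the history by group rather than by user would extend the same check to collaborating users, at the cost of more stored history and more false positives.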


References

Database Inference. (2012). Retrieved from

Hylkema, M. (2009). A Survey of Database Inference Attack Prevention Methods. Retrieved from

Jajodia, S., Meadows, C. Inference Problems in Multilevel Secure Database Management Systems. Retrieved from

Morgenstern, M., Denning, D., Akl, S., Heckman, M. (1987). Views for Multilevel Database Security. Retrieved from

Raman, S. (2001). Detecting Inference Attacks Using Association Rules. Retrieved from

Rouse, M. (2011). Association Rules In Data Mining. Retrieved from



phpMyAdmin is a free and open source tool written in PHP intended to handle the administration of MySQL with the use of a web browser. It can perform various tasks such as creating, modifying or deleting databases, tables, fields or rows; executing SQL statements; or managing users and permissions.


Features provided by the program include:

  1. Web interface
  2. MySQL database management
  3. Import data from CSV and SQL
  4. Export data to various formats: CSV, SQL, XML, PDF (via the TCPDF library), ISO/IEC 26300 - OpenDocument Text and Spreadsheet, Word, Excel, LaTeX and others
  5. Administering multiple servers
  6. Creating PDF graphics of the database layout
  7. Creating complex queries using Query-by-Example (QBE)
  8. Searching globally in a database or a subset of it
  9. Transforming stored data into any format using a set of predefined functions, like displaying BLOB-data as image or download-link
  10. Live charts to monitor MySQL server: activity, connections, processes, CPU/Memory usage, etc.




SQL Injection



In 2003, an analysis of the buffer overflow exploit pronounced it the vulnerability of the decade (Cowan et al., 2003). Since that time, buffer overflow exploits have ranked among the top ten in lists published by the Open Web Application Security Project (OWASP), the National Vulnerability Database, and the Common Weakness Enumeration / SANS Top 25 Most Dangerous Software Errors.

SQL injection is a code injection technique that can be used against MS SQL, MySQL, and other DBMSs. Like the buffer overflow, it exploits input that an application fails to handle safely.

A buffer is a region of memory that temporarily holds data. In a buffer overflow attack, a malicious program writes more data into the buffer than it can hold. This can cause errors, program crashes, and security breaches.

A buffer overflow occurs when a program or process tries to store more data in a buffer than it was intended to hold, causing the application to write beyond the bounds of a pre-allocated, fixed-size buffer. The overflowing data overwrites adjacent memory locations and may corrupt the valid data held in them, or cause new instructions to execute on the affected computer that could damage user files, change data, or disclose confidential information.

Mitigation. Because SQL injection exploits a vulnerability at the database layer of an application, a simple mitigation is to never allow unvalidated user input to be directly embedded into SQL statements.
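A minimal sketch of the difference, using Python's built-in sqlite3 module. The users table and the input string are hypothetical; the point is only that a bound parameter is treated as data, whereas concatenated input becomes part of the SQL statement.

```python
import sqlite3

# Hypothetical table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "clerk")],
)

# Attacker-controlled input that tries to widen the WHERE clause.
user_input = "nobody' OR '1'='1"

# Unsafe: the input is embedded directly into the SQL text, so the
# injected OR clause is parsed as SQL and every row matches.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the input is passed as a bound parameter and is compared as a
# literal string, so no row matches.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

Most database drivers offer an equivalent placeholder mechanism, though the placeholder syntax varies between them.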

Another mitigation strategy is to use a programming language that performs its own memory management, such as Java or Perl, or an environment like .NET, which may diminish the impact of buffer overflows. Additionally, Cyclone, a safe dialect of C, may be used to negate buffer overflows and other related exploits. Other languages, such as C# and Ada, which support run-time checks to protect against access to unallocated memory, buffer overflow errors, range violations, and other bugs, may also be used. Both languages allow the programmer to disable the checking functionality, if need be, to enhance performance.

We should assume that all user input is malicious, and therefore accept only input that matches a whitelist of acceptable values conforming to the specification. Blacklists, or checks that look solely for malformed input, cannot be depended upon, though blacklists may be used as a yardstick for attack detection.
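A minimal sketch of whitelist validation follows. The username pattern, its length limit, and the set of permitted sort columns are hypothetical specifications chosen for the example; a real application would take them from its own requirements.

```python
import re

# Hypothetical specification: 1 to 32 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,32}")

# Hypothetical set of column names a caller is allowed to sort by.
ALLOWED_SORT_COLUMNS = {"name", "created_at"}

def is_valid_username(value):
    """Accept only input that matches the whitelist pattern in full."""
    return USERNAME_PATTERN.fullmatch(value) is not None

def safe_sort_column(value, default="name"):
    """Return value if it is an allowed column, else a safe default."""
    return value if value in ALLOWED_SORT_COLUMNS else default

print(is_valid_username("alice_01"))             # accepted
print(is_valid_username("alice'; DROP TABLE"))   # rejected
print(safe_sort_column("salary; --"))            # falls back to default
```

Anything not explicitly allowed is rejected, so new attack strings do not need to be anticipated in advance.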

It is also advised that we run our DBMS in a sandbox, and that we set boundaries between processes and the operating system. Additionally, we should run our DBMS with the lowest level of privileges necessary to perform our functions.

We may also consider using Data Execution Prevention (DEP) / NX memory protection. This is a common security feature included in Windows, Linux, and Mac operating systems. Its function is to prevent services and applications from executing code in a memory region marked as non-executable, thus blocking exploits that store code in memory via a buffer overflow.


Cowan, C., et al. (2003). Buffer Overflows: Attacks and Defenses for the Vulnerability of the Decade.

OWASP Top 10. (2010). The Top 10 Most Critical Web Application Security Risks.