Full SQL and SQL monitoring, deeper view on Access domain
For better understanding SQL entity we need to describe a little bit deeper the logging actions in Guardium policy.
My audit policy (selective audit trail) contains two rules.
It will log activity of syntaxuser1 using LOG action (LOG ONLY) and other traffic will be audited with details – LOG FULL DETAILS action.
I connected to postgreSQL database two times as a test and syntaxuser1
and these sessions are visible (right) but report based on Full SQL entity does not contain syntaxuser1 activity (left).
The reason is simple and understanding of this is very important to create accurate database monitoring policy and reports. The LOG ONLY action logs SQL constructs and does not audit full SQL body executed inside session.
So what does exactly LOG ONLY log?!
The LOG ONLY (it is also default action for non selective audit trail policies!) removes SQL parameter values from SQL body. For instance 3 SQL’s:
SELECT * FROM table WHERE columnX='value1' SELECT * FROM table WHERE columnX='value2' SELECT * FROM table WHERE columnX='value3'
are described as a one SQL construct
SELECT * FROM table WHERE columnX='?'
so audited activity based on LOG ONLY action allows identify syntax but it is not possible to present full body (if SQL contains parameters).
The main purpose of LOG ONLY action use is the meaningful decrease of disk space consumption by audited traffic because we do not need store each SQL and put only reference (Construct ID) to known by collector SQL constructs stored in SQL entity.
It is a good time to introduce very important entity – Access Period. Independently to FULL SQL flow (described in part 1) Guardium stores audited activity inside the hourly based sets named periods – we can visualize them as data partitions. Now we can discuss sense of this kind approach but it was historic decision (more that 10 years ago) based among others on cost of storage and CPU utilization.
Periods describe the all audited traffic on hour basis and simplify data partitioning and point executed in this timeframe SQL’s by Instance ID and Construct ID keys . So we can present data inside entities this way
Data flow in 5 main entities:
- policy makes decision to log activity (LOG ONLY or LOG FULL DETAILS)
- if SQL is related to new session – new Session ID is registered and Access ID is attached to it or new connection profile is registered
- system checks – is Session ID registered in the current period (current hour)?
- False – new Instance ID is created in Access Period entity
- if LOG ONLY action is used
- the SQL is “anonymized” – parameter values are replaced by question mark
- system checks existence of SQL construct in SQL Entity
- False – new Construct ID is registered in SQL Entity
- new record (access) is attached to current period (partition) in the Access Period entity – Access ID, relations to SQL (Construct ID) and Session (Session ID)
- if LOG FULL DETAILS action is used
- the SQL is registered in FULL SQL with reference to Session ID
- SQL is “anonymized” and registered in Access Period (Instance ID) this same way like described for LOG ONLY action
- Record in FULL SQL stores reference to Instance ID in Access Period
We should be aware some limitations if LOG ONLY action is used to audit session:
- data are stored without parameter values
- we cannot identify exact time of SQL execution, we can estimate time by reference to:
- Session timestamps – between Session Start and Session End
- Access Period – SQL execution inside partition, between Period Start and Period End
- Access Period Timestamp – last execution of particular SQL construct inside period instance
- SQL’s from one session can be located in many periods (partitions) if session involves many hours
- Both logging actions can be used inside policy to audit activity from this same session (it is powerful) but it can lead to incorrect conclusions if we base on FULL SQL report only
- The Period Start is used as a timestamp for Access Period Entity (the Timestamp field has not important value)
It should be also stressed that LOG FULL DETAILS action stores SQL in both entities Full SQL and SQL
So, from theory to practice 🙂
Example 1 – No space on disk, no data in reports
It happens when we use LOG ONLY action in our policies and we try to review activity in the report based on FULL SQL entity
SQL counter identifies thousands constructs (report based on SQL entity) but SQL syntax report is empty (based on FULL SQL) – my audit policy uses LOG ONLY action. If you log events using LOG ONLY action somewhere you should not report data based on FULL SQL to report them.
Example 2 – Where is timestamp for SQL entity?
I created a query based on main entity – SQL
and you should notice that there is no timestamp inside the available in SQL entity fields. What does it mean? Can we create report base on it and use time period specification for results?
Yes, we can, because query gathers timestamp from the closest (direct) relation if it does not exists in the main entity.
The direct relation for SQL entity is Access Period which use Period Start field as timestamp. It has a big influence on result 🙂
I connected to database two times and executed simple query in each session
then I tried to display my activity related to second session only (base on Start Date after 19:18:00)
and both sessions are visible. The explanation is simple – timestamp for this report bases on Period Start field what for the input value 19:18 defines the period 19:00-20:00 (7-8 am) where two Test user sessions happened.
There is no possibility to granular time different way (more in Example 5) because it is main entity dependent. How to display SQL’s belonging only to the second session? – we can use Session ID for example as a filter
I put Session ID of interesting me session in the added filter and “voila”
Example 3 – I do not see part of my SQL’s – situation 1
This time I executed 6 SQL’s but only two of them are displayed in my report based on SQL entity.
What I pointed before, the SQL entity stores constructs instead of FULL SQL body so SELECT 1, SELECT 2 and SELECT 3 are visible here as SELECT ? and 3 executions of SELECT now() also point only one SQL construct. Please notice also that Timestamp in Access Period entity points the time closest to the last execution of particular SQL construct.
Does it mean that we cannot identify the exact number of executed SQL’s if LOG ONLY action is used? – of course, we can 🙂 The number of occurrences of constructs are stored in Access Period and we can refer to it from SQL entity using entity counter (Add Count)
Now we have full information that my session contained two SQL constructs and each of them was executed three times.
The SQL entity does not store execution timestamp so order of executed constructs is unknown.
Example 4 – I do not see part of my SQL’s – situation 2
This time I executed 8 SQL’s inside session
and only four appear in the report. You should notice that my session lifetime covers 2 periods (08:00-09:00 and 09:00-10:00) but report time range refers only to the second one. In Session Start column we have information when session started and my period reference has to point to it if I would like to receive full session statistics
Example 5 – One hour granularity is not foxy
Guardium allows decrease the default access period time granularity from one hour to 1 minute even.
Do not forget Apply changes and Restart IE’s before. Here the Logging Granularity has been set to 10 minutes what is visible in the report below
Example 6 – SQL or Access Period as main entity?
The SQL entity as a main entity is used only when information about construct body is needed. If we focus on quantitative analysis and interesting in the user behave the Access Period domain is much more efficient. Access Period is also the timestamp reference entity for very useful entities like Command and Object (I will focus on them in the next article about Guardium reporting)
Example 7 – How to see all SQL’s
It is common situation that Guardium policy mixes LOG ONLY and LOG FULL DETAILS actions inside rules. So only part of activity can be reported using FULL SQL entity. We do not need create complicated reports to summarize and analyze this diversified type of auditing because each fully monitored SQL is also stored inside SQL entity.
It should be now clear that FULL SQL refers to Access Period using Instance ID key and indirectly we can identify executed Construct ID.
Please remember that any audited activity in Guardium is always visible inside Access Period and SQL entities.
Any quantitative analysis should relies on them especially when not only LOG FULL DETAILS action is used inside policy rules.
The report data extraction works much more efficient if we use Access Period instead of heavy queries on FULL SQL entity.