Flow Search
Searching
Each row on the Flow page represents a group of flows that share common attributes; in short, a search. The search is written in a Lucene-like syntax, which is a powerful way to search data and is often more intuitive than SQL. Flow data is displayed in JavaScript Object Notation (JSON) format. Clicking the code icon (</>) displays the full JSON record of the flow.
When you click the code icon (</>), the row expands to show the JSON. For this example, the complete record looks like:
{
  "@message": "Id = {66FCA14B-764C-40AC-93BF-E125DE1F71B9}; ClientMachine = FLUENCY-WINSRV; User = NT AUTHORITY\\SYSTEM; ClientProcessId = 1648; Component = Unknown; Operation = Start IWbemServices::ExecQuery - ROOT\\CIMV2\\Security\\MicrosoftVolumeEncryption : SELECT * FROM Win32_EncryptableVolume WHERE DeviceID='\\\\\\\\?\\\\Volume{97dda6ad-aee9-11e9-80b8-f01fafe4c694}\\\\'; ResultCode = 0x80041032; PossibleCause = Unknown",
  "@level": "notice",
  "@timestamp": 1602183532000,
  "@customer": "tls",
  "@source": "192.168.1.1",
  "@tags": [
    "ERROR"
  ],
  "@type": "event",
  "@facility": "tls",
  "@sender": "192.168.1.1",
  "@fields": {
    "EventTime": "2020-10-08 14:58:51",
    "Hostname": "FLUENCY-WINSRV",
    "Keywords": 4611686018427388000,
    "EventType": "ERROR",
    "SeverityValue": 4,
    "Severity": "ERROR",
    "EventID": 5858,
    "SourceName": "Microsoft-Windows-WMI-Activity",
    "ProviderGuid": "{1418EF04-B0B4-4623-BF7E-D74AB47BBDAA}",
    "Version": 0,
    "Task": 0,
    "OpcodeValue": 0,
    "RecordNumber": 3146990,
    "ActivityID": "{115A71ED-3A58-4723-B0E5-576C8400C90F}",
    "ProcessID": 916,
    "ThreadID": 57152,
    "Channel": "Microsoft-Windows-WMI-Activity/Operational",
    "Domain": "NT AUTHORITY",
    "AccountName": "SYSTEM",
    "UserID": "S-1-5-18",
    "AccountType": "User",
    "Opcode": "Info",
    "EventReceivedTime": "2020-10-08 14:58:52",
    "SourceModuleName": "eventlog_in",
    "SourceModuleType": "im_msvistalog"
  },
  "@eventType": "nxlogAD"
}
We can see how much data there really is in a single record. Fluency keeps data in two formats: flow and event. Events are the direct, uncorrelated logs, while flow records describe a flow and contain all of its related events.
The basis of a flow is the tuple. A TCP/IP tuple is composed of the source and destination addresses, the source and destination ports, and the protocol used, such as TCP or UDP. Every tuple is unique at any given time. Fluency uses this property of tuples to fuse related data into a single document, "document" being the big data term for a record.
A Fluency flow record contains no communication content unless a reporting system provides it. All the data being displayed, such as how long the communication lasted and the details of the HTTP negotiation, is metadata. This type of data is most commonly used for behavioral analysis and analytics.
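The idea of fusing events by tuple can be sketched in a few lines of Python. This is only an illustration, not Fluency's implementation: the function names are hypothetical, the `source` field is invented for the example, and the sketch ignores time, whereas a real system would also need to separate sessions that reuse the same tuple at different times.

```python
from collections import defaultdict

def tuple_key(event):
    """Build the tuple key: addresses, ports and protocol number."""
    return (event["sip"], event["sp"], event["dip"], event["dp"], event["prot"])

def fuse_by_tuple(events):
    """Group events that share a tuple into one flow document each.

    Assumption: time windows are ignored in this sketch.
    """
    flows = defaultdict(lambda: {"events": []})
    for event in events:
        flows[tuple_key(event)]["events"].append(event)
    return dict(flows)

# Hypothetical events reported by two different systems.
events = [
    {"sip": "10.0.0.180", "sp": 51234, "dip": "10.0.55.91", "dp": 80, "prot": 6, "source": "firewall"},
    {"sip": "10.0.0.180", "sp": 51234, "dip": "10.0.55.91", "dp": 80, "prot": 6, "source": "proxy"},
    {"sip": "10.0.0.181", "sp": 40000, "dip": "10.0.55.91", "dp": 443, "prot": 6, "source": "firewall"},
]

flows = fuse_by_tuple(events)
for key, flow in flows.items():
    print(key, len(flow["events"]))
```

The first two events share a tuple, so they end up in the same flow document; the third becomes a flow of its own.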
Field-Value Pairings
The most common search is a Field-Value Pairing search. This means we are looking to match a field with a particular value.
For example, let’s say we want to find all the flow records going to IP address 10.0.55.91. This can be done by writing a search string in which the field and the value we are looking for are separated by a colon:
dip:10.0.55.91
There is some serious magic in this statement. If you know NetFlow, then you know that dip stands for destination IP address. Because we are on the global->flow page, we know that all the NetFlow tuple data fields are going to appear at the start of the record:
sip: source ip address
dip: destination ip address
sp: source port
dp: destination port
txB: transmitted total bytes
rxB: received total bytes
txP: transmitted number of packets
rxP: received number of packets
rf: a joined value of the received flags
tf: a joined value of the transmitted flags
start_ms: the starting time of the tuple session in UNIX millisecond time
prot: the protocol number
totalB: the total number of bytes sent and received
dur: length in milliseconds that the tuple session lasted
Fluency only relies on the tuple information for correlation, but all these fields are searchable. So, where is the magic if it’s so straightforward?
The magic is that outside of the tuple fields for the global->flow page, none of the other fields are defined, let alone required. Fluency is a schemaless database. This means that when data is parsed, the field names are defined completely by the parser.
This allows Fluency to be forward compatible with new data content that may appear in logs. Only Fluency and Elastic have this capability. It allows for searches to be available as soon as the new data is parsed.
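To make the combination of field-value pairing and schemaless records concrete, here is a minimal Python sketch. It is not Fluency's search engine: `parse_pair`, `matches` and the `vendor_field` name are hypothetical, and the comparison is a plain exact string match rather than full Lucene-style matching.

```python
def parse_pair(query):
    """Split a 'field:value' search string at the first colon."""
    field, _, value = query.partition(":")
    return field.strip(), value.strip()

def matches(record, query):
    """Return True when the record has the field and its value matches exactly."""
    field, value = parse_pair(query)
    return field in record and str(record[field]) == value

# Records need not share a schema: the second one carries an extra,
# parser-defined field (hypothetical name) that the first one lacks.
records = [
    {"sip": "10.0.0.180", "dip": "10.0.55.91", "dp": 80},
    {"sip": "10.0.0.181", "dip": "10.0.55.92", "dp": 443, "vendor_field": "x"},
]

dip_hits = [r for r in records if matches(r, "dip:10.0.55.91")]
vendor_hits = [r for r in records if matches(r, "vendor_field:x")]
print(len(dip_hits), len(vendor_hits))
```

Because nothing is declared up front, a field introduced by a new parser becomes searchable the moment a record containing it exists.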
Referring to Nested Fields
There are two reasons why the code button (</>) is important. The first is that it lets you see all the fields that are available. The other is that it shows the structure of each field. Fluency stores its records in JSON format, and a JSON object can contain other JSON objects. For example, the http JSON object in a flow record contains more fields of its own, just as @fields does in the record shown earlier. When one object is inside another, it is referred to as nested.
To use field-value pairing to find the flows going to terplab.cloud.fluencysecurity.com, we need to list every parent field before the field name, separated by periods. For this example, it means:
http.host: terplab.cloud.fluencysecurity.com
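How a dotted field name resolves against a nested record can be sketched as follows. The `get_nested` helper is hypothetical and assumes that field names themselves contain no periods. The `@fields.Hostname` path is used here only as a Python lookup on the record shown earlier, not as confirmed Fluency search syntax.

```python
def get_nested(record, path):
    """Walk a nested dict along a dotted path; return None if any part is missing."""
    current = record
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

# A hypothetical flow record with a nested http object.
flow = {
    "dip": "10.0.55.91",
    "http": {"host": "terplab.cloud.fluencysecurity.com"},
}

# A fragment of the event record shown earlier.
event = {"@fields": {"Hostname": "FLUENCY-WINSRV"}}

print(get_nested(flow, "http.host"))
print(get_nested(event, "@fields.Hostname"))
print(get_nested(flow, "http.method"))
```

Each segment of the path names one level of nesting, which is why every parent field has to be listed.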
Using the Facet
Facet is a technical term for a group of attributes; you can think of facets as filters. The Facet Section covers this topic in more detail. A facet appears on the left side of most pages that return a number of records.
The facet is editable. We can add new fields to group by and count. There are three states for each element shown:
- empty: does not impact the search.
- checked box: include data that has this value for this field.
- crossed out box: exclude data that has this value for this field.
If we check the source 10.0.55.91 and the destination port 80, we get all traffic that matches both.
However, if we additionally check 10.0.0.180, we get all port 80 traffic from either of these sources. This means that filters within the same field (category) are combined with OR, so a record matches if it has any of the checked values, while filters across fields are combined with AND, so a record must match in every field.
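The way checked and crossed out boxes combine can be sketched in Python. This is an assumption-laden illustration rather than Fluency's code: `facet_match` is a hypothetical name, checked boxes are modeled as an `include` mapping, and crossed out boxes as an `exclude` mapping.

```python
def facet_match(record, include, exclude=None):
    """Apply facet selections to one record.

    include: field -> set of checked values (any value may match; every field must).
    exclude: field -> set of crossed out values (any match rejects the record).
    """
    exclude = exclude or {}
    for field, values in include.items():
        if record.get(field) not in values:
            return False
    for field, values in exclude.items():
        if record.get(field) in values:
            return False
    return True

# Hypothetical flow records.
records = [
    {"sip": "10.0.55.91", "dp": 80},
    {"sip": "10.0.0.180", "dp": 80},
    {"sip": "10.0.0.180", "dp": 443},
    {"sip": "10.0.0.7", "dp": 80},
]

# Two sources checked, one destination port checked.
include = {"sip": {"10.0.55.91", "10.0.0.180"}, "dp": {80}}
hits = [r for r in records if facet_match(r, include)]

# Crossing out a source instead: port 80 traffic from anyone but 10.0.0.7.
others = [r for r in records if facet_match(r, {"dp": {80}}, {"sip": {"10.0.0.7"}})]
print(len(hits), len(others))
```

Fields with no box checked or crossed out are simply left out of both mappings, which corresponds to the empty state.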
You might have also noticed the +/- buttons to the right of a field (category). These control how many values are shown for that field in the facet.
If you click on the “+” (plus sign), you will see the number of values increase until it reaches the maximum of fifty (50). This limit is arbitrary. The “-” (minus sign) has the opposite effect, removing values from view.
Facets are an easy way to focus in on data and see the most common responses by field.
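The "most common responses by field" view amounts to counting values and keeping the top few. The sketch below uses Python's `collections.Counter` to show the idea. The `facet_counts` helper and the default of ten shown values are assumptions made for illustration; only the cap of fifty comes from the text above.

```python
from collections import Counter

MAX_FACET_VALUES = 50  # the maximum number of values a facet shows

def facet_counts(records, field, shown=10):
    """Return the most common values of a field, capped at the facet maximum.

    The default of 10 shown values is an assumption for this sketch.
    """
    shown = max(0, min(shown, MAX_FACET_VALUES))
    counts = Counter(r[field] for r in records if field in r)
    return counts.most_common(shown)

# Hypothetical records; the last one has no destination port at all.
sample = [{"dp": 80}, {"dp": 80}, {"dp": 443}, {"sip": "10.0.0.180"}]

print(facet_counts(sample, "dp"))
print(facet_counts(sample, "dp", shown=1))
```

Raising or lowering `shown` plays the role of the plus and minus buttons, and records that lack the field are skipped.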