File specifications for providing data to WindESCo
Overview
The purpose of this document is to define the CSV format for data files used to transfer data to WindESCo.
Other formats are acceptable, but WindESCo may add a surcharge to transform the data into the required format.
Wide and Narrow Formats
There are two types of file formats: wide and narrow.
In wide format, each row begins with a timestamp, and each subsequent column has a signal value that corresponds to that timestamp. This format corresponds well with a database table. For example:
ts,signal1,signal2,signal3
2020-02-01T00:00:00Z,1,2,3
2020-02-01T00:00:01Z,4,5,6
2020-02-01T00:00:02Z,7,,9
...
In narrow format, each row contains a timestamp, a signal identifier (e.g. tag name), and a value. This format allows each signal to be time-stamped independently. Exports from OSIsoft PI systems typically use the narrow format. For example:
ts,signal_name,value
2020-02-01T00:00:00Z,signal1,1
2020-02-01T00:00:00Z,signal2,2
2020-02-01T00:00:00Z,signal3,3
2020-02-01T00:00:01Z,signal1,1
2020-02-01T00:00:01Z,signal2,1
2020-02-01T00:00:01Z,signal3,4
...
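The relationship between the two formats can be illustrated with a short Python sketch that pivots narrow-format text into wide-format text. The function name narrow_to_wide is hypothetical and is not part of any WindESCo tooling. The sketch assumes the column names ts, signal_name, and value from the example above, and it assumes all timestamps share one ISO 8601 UTC layout so that sorting them as text gives chronological order.

```python
import csv
import io


def narrow_to_wide(narrow_csv: str) -> str:
    """Pivot narrow-format CSV text (ts,signal_name,value) into wide-format CSV text."""
    reader = csv.DictReader(io.StringIO(narrow_csv))
    rows = {}     # timestamp -> {signal_name: value}
    signals = []  # signal names in order of first appearance
    for rec in reader:
        rows.setdefault(rec["ts"], {})[rec["signal_name"]] = rec["value"]
        if rec["signal_name"] not in signals:
            signals.append(rec["signal_name"])

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ts"] + signals)
    # Assumption: identically formatted UTC timestamps sort chronologically as strings.
    for ts in sorted(rows):
        # A signal with no value at this timestamp becomes an empty field.
        writer.writerow([ts] + [rows[ts].get(name, "") for name in signals])
    return out.getvalue()
```

A signal that has no sample at a given timestamp ends up as an empty field in the wide output, as in the third data row of the wide example.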
Format Requirements
Comma delimiter
Data elements and headers shall be separated by commas.
Headers
There shall be exactly one row of headers, with each header describing the column below it. No additional metadata shall be included anywhere in the file. Headers may be enclosed in quotes, but this is not required.
Timestamps
Date and time shall be combined into a single column in ISO 8601 format, e.g. 2021-02-09T14:32:35Z. Timestamps shall have a resolution of 1 second or finer: millisecond resolution is acceptable, but 1-minute resolution is not. Timestamps shall be in UTC. Timestamps may be enclosed in quotes, but this is not required.
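As an illustration, a timestamp in this form could be produced in Python roughly as follows. This is a minimal sketch using only the standard library; the helper name to_utc_iso8601 is hypothetical, and the sketch assumes the source timestamps are timezone-aware so that the conversion to UTC is unambiguous.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso8601(dt: datetime, milliseconds: bool = False) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC string ending in 'Z'."""
    if dt.tzinfo is None:
        # A naive datetime carries no offset, so it cannot be converted safely.
        raise ValueError("timestamp must be timezone-aware")
    utc = dt.astimezone(timezone.utc)
    timespec = "milliseconds" if milliseconds else "seconds"
    # isoformat() renders UTC as '+00:00'; replace it with the 'Z' suffix.
    return utc.isoformat(timespec=timespec).replace("+00:00", "Z")


# Example: 09:32:35 at UTC-5 is 14:32:35 UTC.
local = datetime(2021, 2, 9, 9, 32, 35, tzinfo=timezone(timedelta(hours=-5)))
stamp = to_utc_iso8601(local)
```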
Number format
Numbers shall be formatted with no thousands separator and shall use the period character, not the comma character, as the decimal separator. Numbers shall not be enclosed in quotes.
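One way to produce numbers in this form from Python is fixed-point string formatting, which does not depend on the system locale. The sketch below is illustrative only; the helper name format_number and the default of three decimal places are assumptions, not requirements of this specification.

```python
def format_number(value: float, decimals: int = 3) -> str:
    """Render a number with a period decimal separator and no thousands separator."""
    # The 'f' presentation type ignores the locale, so the output never
    # contains a comma, either as a decimal mark or as a digit grouper.
    return format(value, f".{decimals}f")
```

For instance, format_number(1234567.891, 2) gives 1234567.89.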
No strings in data
There shall be no strings in the data. For example, turbine state shall be a number, not a name. Bad-quality data shall be omitted rather than replaced with a string such as “bad quality”.
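A data field can be checked against this rule before a file is sent. The sketch below treats a field as acceptable when it is either empty (omitted data) or a plain decimal number. The function name is_valid_value is hypothetical. Rejecting scientific notation such as 1e5 is an assumption of this sketch, since this specification does not address it.

```python
import re

# Optional sign, then digits with an optional period-separated fraction.
# Assumption: scientific notation (e.g. 1e5) is not accepted.
_NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def is_valid_value(field: str) -> bool:
    """Return True if a data field is empty or a plain, unquoted decimal number."""
    return field == "" or _NUMBER.fullmatch(field) is not None
```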
Data shape
Each row of the file shall have the same number of columns, including the header row.
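The shape requirement can be verified with a few lines of Python. This is a sketch under the assumption that the whole file fits in memory as text; the function name find_ragged_rows is hypothetical.

```python
import csv
import io


def find_ragged_rows(csv_text: str) -> list:
    """Return the 1-based line numbers of rows whose column count differs from the header's."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to check
    expected = len(header)
    # Data rows start on line 2, directly after the header row.
    return [n for n, row in enumerate(reader, start=2) if len(row) != expected]
```

An empty field still counts as a column, so a row such as 2020-02-01T00:00:02Z,7,,9 passes the check against a four-column header.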
Data partitioning
Each file shall contain all signals for a single turbine. Each file shall not exceed 1 GB. If the data for the requested time range exceeds this limit, it shall be split across multiple files.
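Splitting a large export into several files might look like the following sketch, which groups already-formatted CSV lines into chunks and repeats the header at the top of each chunk. The function name split_csv_lines is hypothetical. The sketch assumes UTF-8 encoding and a single newline character per line, and the caller chooses the byte limit, since this specification does not say whether 1 GB means 10^9 or 2^30 bytes.

```python
def split_csv_lines(header: str, rows: list, max_bytes: int) -> list:
    """Group CSV lines into chunks that each start with the header and fit in max_bytes."""
    header_size = len((header + "\n").encode("utf-8"))
    chunks = []
    current, size = [header], header_size
    for row in rows:
        row_size = len((row + "\n").encode("utf-8"))
        # Start a new chunk when this row would push the current one over the limit.
        if size + row_size > max_bytes and len(current) > 1:
            chunks.append(current)
            current, size = [header], header_size
        current.append(row)
        size += row_size
    if len(current) > 1:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be written to its own file, one line per element.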