M4 20Hz Cleaned or Rotated Data NOT always 12,000 Records

Everett.Perry
Posts: 33
Joined: Tue Jan 29, 2013 9:53 am
Organization: Texas Tech University
Location: Oregon

Postby Everett.Perry » Fri Mar 08, 2013 10:47 am

Hello,

I recently started using the M4 20Hz MAT files and noticed an inconsistency in the record lengths of some of the Sonic variables. The NWTC website indicates that cleaned or rotated data will always be 12,000 records long (as per software version 1.21). This does not appear to be the case. The following details are from a single M4 20Hz MAT file (version 1.23). Notice that some of the cleaned or rotated variables contain 12,000 records while others do not.

Notice that many of the 30m variables are fine (they contain 12,000 records). Also notice that the variable "Sonic_temp_30" should probably be "Sonic_Temp_30" (capital T). I only mention this typo because the 30m variables seem to behave differently in this file, so it may help with troubleshooting.

I know the formatting below is terrible! (couldn't seem to make it much better, sorry)

Current File Name: 10122_01_10_00_020.mat

tower.processing.code

date: [2013 1 29 17 0 0]
version: 1.2300

Var_Name Var_Length
Sonic_Temp_100 11999
Sonic_Temp_131 11999
Sonic_Temp_15 11999
Sonic_Temp_50 11999
Sonic_Temp_76 11999
Sonic_Temp_clean_100m 11999
Sonic_Temp_clean_131m 11999
Sonic_Temp_clean_15m 11999
Sonic_Temp_clean_30m 12000
Sonic_Temp_clean_50m 11999
Sonic_Temp_clean_76m 11999
Sonic_Temp_rotated_100m 11999
Sonic_Temp_rotated_131m 11999
Sonic_Temp_rotated_15m 11999
Sonic_Temp_rotated_30m 12000
Sonic_Temp_rotated_50m 11999
Sonic_Temp_rotated_76m 11999
Sonic_cleaned_timestamp 11999
Sonic_rotated_timestamp 12000
Sonic_temp_30 11999
Sonic_u_100m 11999
Sonic_u_131m 11999
Sonic_u_15m 11999
Sonic_u_30m 12000
Sonic_u_50m 11999
Sonic_u_76m 11999
Sonic_v_100m 11999
Sonic_v_131m 11999
Sonic_v_15m 11999
Sonic_v_30m 12000
Sonic_v_50m 11999
Sonic_v_76m 11999
Sonic_w_100m 11999
Sonic_w_131m 11999
Sonic_w_15m 11999
Sonic_w_30m 12000
Sonic_w_50m 11999
Sonic_w_76m 11999
Sonic_x_100 11999
Sonic_x_131 11999
Sonic_x_15 11999
Sonic_x_30 11999
Sonic_x_50 11999
Sonic_x_76 11999
Sonic_x_clean_100m 11999
Sonic_x_clean_131m 11999
Sonic_x_clean_15m 11999
Sonic_x_clean_30m 12000
Sonic_x_clean_50m 11999
Sonic_x_clean_76m 11999
Sonic_y_100 11999
Sonic_y_131 11999
Sonic_y_15 11999
Sonic_y_30 11999
Sonic_y_50 11999
Sonic_y_76 11999
Sonic_y_clean_100m 11999
Sonic_y_clean_131m 11999
Sonic_y_clean_15m 11999
Sonic_y_clean_30m 12000
Sonic_y_clean_50m 11999
Sonic_y_clean_76m 11999
Sonic_z_100 11999
Sonic_z_131 11999
Sonic_z_15 11999
Sonic_z_30 11999
Sonic_z_50 11999
Sonic_z_76 11999
Sonic_z_clean_100m 11999
Sonic_z_clean_131m 11999
Sonic_z_clean_15m 11999
Sonic_z_clean_30m 12000
Sonic_z_clean_50m 11999
Sonic_z_clean_76m 11999
time_UTC 11999
-----------------------------------------------------

Another example file from the M4 20Hz data: 10122_16_20_00_020.mat.
This file seems to be OK at 131m and 30m (12,000 records); the other heights contain 11,997 records.

Regards,
Everett
Everett Perry
PhD Candidate, National Wind Institute
Texas Tech University

Postby Everett.Perry » Sat Mar 09, 2013 9:24 am

I did some more work here. I looked at all variable lengths for the files in the following directory:
path: 'S:\Projects\MetData\M4Twr\2012\10\12'

The attached file is a tab delimited file that shows the variable lengths for these 144 files. Open with Excel or Notepad++ (no carriage returns so regular Notepad is ugly).

Everett
Attachment: M4_20Hz_Variable_Length.txt (tab delimited, no carriage returns)

Andy.Clifton
Posts: 83
Joined: Wed Feb 29, 2012 3:13 pm
Organization: NREL
Location: Boulder, CO

Postby Andy.Clifton » Mon Apr 01, 2013 11:49 am

Hi Everett,

Thanks again for posting your comments to the forums, rather than sending me emails. This way everyone gets to learn about any issues with the data.

I took a look at the file you sent through. What appears to be happening is that for a given file, the raw data might only be 11,998 records long because we missed a sample at some point. Maintaining 20 Hz can be a real challenge sometimes. Then, when I go through the data processing routines I remap the sonic data to a continuous 20-Hz time series (note that where there are readings I don't resample, I only shift things by a hundredth of a second). That means I have a true 20-Hz clean signal. I can do this because I know that the data system was actually triggered at 20 Hz, but sometimes we have hiccoughs in getting the time stamp.

If a file ends short, I extend it using the mean of the time series.

So my original data might look like this:
elapsed time [s], value
0, 3.4
0.05, 3.6
0.099, 3.75
0.15, 3.8
0.25, 4.0
...
599.90, 12
- END OF FILE -

And then it gets mapped to the nearest 0.05 second point and interpolated where there is no measurement:

0, 3.4
0.05, 3.6
0.1, 3.75 <- data remapped to 20-Hz time series
0.15, 3.8
0.25, 4.0
...
599.90, 12
599.95, 8.9 <- or whatever the mean is
- END OF FILE -

This only happens to the 'clean' sonic data in the 20-Hz files, so you only see this effect in some columns.
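
A minimal sketch of the remapping described above, not the actual NWTC processing code. It assumes linear interpolation for interior gaps and pads with the mean of the raw readings; the variable names (t_raw, y_raw, idx, t_grid, y_grid) and the short 8-sample window are hypothetical and only there to keep the example small. It does not handle two readings that snap to the same grid point.

```matlab
% Hypothetical raw record: elapsed time [s] and value, with timing jitter,
% one missing sample (0.20 s) and a file that ends one sample early.
daqfreq    = 20;     % sampling frequency [Hz]
windowsize = 8;      % samples per file (12000 in the real 10-minute files)
t_raw = [0; 0.05; 0.099; 0.15; 0.25; 0.30];
y_raw = [3.4; 3.6; 3.75; 3.8; 4.0; 4.1];

% Snap every reading to the nearest point on the ideal 20-Hz grid.
idx = round(t_raw * daqfreq) + 1;          % 1-based grid index per reading

% Full-length grid; samples without a reading start out as NaN.
t_grid = (0:windowsize-1)' / daqfreq;
y_grid = nan(windowsize, 1);
y_grid(idx) = y_raw;

% Fill interior gaps by linear interpolation (assumed method).
last = max(idx);
gap  = find(isnan(y_grid(1:last)));
y_grid(gap) = interp1(t_grid(idx), y_raw, t_grid(gap), 'linear');

% If the file ends short, extend it with the mean of the readings.
y_grid(last+1:end) = mean(y_raw);
```

With these inputs the reading at 0.099 s lands on the 0.10 s grid point, the missing 0.20 s sample is interpolated, and the final grid point is filled with the mean, so y_grid always has windowsize values.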
Andy Clifton, Ph.D.
Senior Engineer

Postby Everett.Perry » Mon Apr 01, 2013 12:13 pm

Perfect!

Thanks Andy

Postby Everett.Perry » Fri Apr 26, 2013 10:00 am

Hi Andy

I think there could still be a problem with the timestamps for the M4 20Hz data. There is a mismatch between the number of records for some of the variables and the associated timestamp within the file.

For instance:

In my first post, I showed a long column of data with variable names and the associated number of records for the variables. Near the bottom of the list you will see that “Sonic_z_clean_30m” has 12,000 values while “Sonic_z_clean_50m” only has 11,999 values. Although these are both “cleaned” variables they have a different number of values. Also notice that the “Sonic_cleaned_timestamp” has 11,999 values. This issue will cause indexing errors and variable length errors in Matlab. A similar situation occurs with the rotated data.

For example, the code you posted in your “software version” post will not work for all files:

figure
plot(24*60*60*(time_UTC.val-time_UTC.val(1)),Sonic_z_15.val,'ko') % raw data
hold on
plot(Sonic_cleaned_timestamp.val,Sonic_z_clean_15m.val,'r+') % cleaned data
plot(Sonic_rotated_timestamp.val,Sonic_w_15m.val,'bx') % rotated data

The 2nd plot statement will throw a “vector length error” if I use the file “10122_00_00_00_020.mat”. The “Sonic_cleaned_timestamp” has 11,998 values while the “Sonic_z_clean_15m” has 12,000 values. It seems that the “Sonic_cleaned_timestamp” was not remapped to a 20Hz signal.


To make a long story short, here is my understanding when the data is at least 95% complete:

1) Cleaned variables should always have 12,000 values and should exactly match the number of values in the “Sonic_cleaned_timestamp” (currently not always true).
2) Rotated variables should always have 12,000 values and should exactly match the number of values in the “Sonic_rotated_timestamp” (currently not always true).
3) Raw variables should have the same number of values as the raw timestamp (time_UTC), (currently seems to be true).
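
The three expectations above can be checked with a short script. This is only a sketch: the struct S below is a hypothetical stand-in for the result of load() on one file, filled with the lengths reported in the first post, and the '_clean_' name filter is an assumption about how the cleaned variables are named.

```matlab
% Hypothetical stand-in for S = load('10122_01_10_00_020.mat'); each
% variable is a struct with a .val field, as in the plotting code above.
S = struct();
S.Sonic_cleaned_timestamp.val = zeros(11999, 1);
S.Sonic_z_clean_30m.val       = zeros(12000, 1);
S.Sonic_z_clean_50m.val       = zeros(11999, 1);
S.time_UTC.val                = zeros(11999, 1);

expected = 12000;                                  % 600 s at 20 Hz
n_time   = numel(S.Sonic_cleaned_timestamp.val);   % cleaned timestamp length

names = fieldnames(S);
mismatched = {};
for k = 1:numel(names)
    name = names{k};
    % Only compare the cleaned sonic variables with the cleaned timestamp.
    if isempty(strfind(name, '_clean_'))
        continue
    end
    n = numel(S.(name).val);
    if n ~= n_time || n ~= expected
        mismatched{end+1} = name; %#ok<SAGROW>
        fprintf('%s: %d values, timestamp has %d\n', name, n, n_time);
    end
end
```

The same loop with '_rotated_' and Sonic_rotated_timestamp would cover the second expectation.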


I apologize if I am beating a dead horse here, but after reading your timestamp QC algorithm, it does not seem that the situations I mentioned above should be possible when the number of samples is at least 95% of 12,000.

Best regards,
Everett

Postby Andy.Clifton » Mon Apr 29, 2013 6:06 pm

I think you found one of the special conditions under which my code breaks down.

As far as I can tell, what is happening there is that the data for one (or more) of the sonics for the 12,000th sample in the time series are NaN. Because there's no data, there's no last record created. I think this happens because I consider all sonics independently. This might take some time to fix! I'll post on here when I've figured out the solution.

FIRST EDIT:
  • The elapsed time should always be 12,000 points (0 to 599.95 s), because it's defined right at the start for all sonics together. It's defined as

    Code: Select all

    dt_full = [0:1/tower.daqfreq:(tower.windowsize-1)/tower.daqfreq]';

    where

    Code: Select all

    tower.daqfreq = 20;
    tower.windowsize = 12000;
  • The actual time series of measurements only includes the samples in the file, which can occasionally be less than that. So there's possibly a mismatch.

    Code: Select all

    dt_clean = [0:1/tower.daqfreq:max(dt)]';

    where dt is the time elapsed since the first timestamp in the data file:

    Code: Select all

    dt = (timestamp-timestamp(1,:))*60*60*24;

  • Interim solution: plot your data as plot(x(1:length(y)),y,'b-'). That should work while I figure out how to recode this. I may need to add some extra variables to the output.
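
A slightly more defensive variant of the interim workaround, as a sketch only. x(1:length(y)) fails when the timestamp is the shorter vector, which is the case reported for 10122_00_00_00_020.mat, so this version truncates both vectors to the shorter length. The vectors x and y are hypothetical stand-ins for a timestamp and a data series; truncating only makes the plot call run and does not guarantee that the two series are aligned in time.

```matlab
% Hypothetical stand-ins for a timestamp and a data vector of unequal length.
x = (0:11997)' / 20;     % e.g. Sonic_cleaned_timestamp.val (11,998 values)
y = zeros(12000, 1);     % e.g. Sonic_z_clean_15m.val (12,000 values)

% Truncate both vectors to the shorter length so that plot() never sees
% mismatched inputs, whichever of the two is the short one.
n = min(numel(x), numel(y));
x_plot = x(1:n);
y_plot = y(1:n);

% Left commented so the sketch also runs without a display:
% plot(x_plot, y_plot, 'b-')
```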

SECOND EDIT
Still something not quite right here.

Postby Andy.Clifton » Tue Apr 30, 2013 2:04 pm

To try to fix this I've modified the variables slightly. I now write out the clean and rotated time stamps for each sonic separately, so there's no assumption that any of them share a common time stamp. The timestamps are written out as Sonic_dt_clean_zm and Sonic_dt_rotated_zm, where z is the sonic height in metres (for example, Sonic_dt_clean_50m).

I've uploaded some data with this format to http://wind.nrel.gov/MetData/M4Twr/V1.25RC/04293_20_40_00_030.mat.

Try using this code with the new data.

Code: Select all

figure
plot(Sonic_dt_clean_50m.val,Sonic_x_clean_50m.val)
hold on
plot(60*60*24*(time_UTC.val-time_UTC.val(1)),Sonic_x_50.val,'r.')
plot(Sonic_dt_rotated_50m.val,Sonic_u_50m.val,'g.')

If this solution works, I'll go ahead and reprocess the data.
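
For scripts that loop over heights, the per-sonic timestamps can be picked up with dynamic field names. The sketch below is an illustration under assumptions: S is a hypothetical stand-in for the result of load() on a file in the new format, and the list of heights is taken from the variable names earlier in this thread.

```matlab
% Hypothetical stand-in for S = load('04293_20_40_00_030.mat').
heights = [15 30 50 76 100 131];       % sonic heights [m] seen in this thread
S = struct();
for z = heights
    S.(sprintf('Sonic_dt_clean_%dm', z)).val = (0:11999)' / 20;
    S.(sprintf('Sonic_x_clean_%dm', z)).val  = zeros(12000, 1);
end

% Pair each cleaned series with its own timestamp and compare lengths.
ok = false(size(heights));
for k = 1:numel(heights)
    t = S.(sprintf('Sonic_dt_clean_%dm', heights(k))).val;
    x = S.(sprintf('Sonic_x_clean_%dm',  heights(k))).val;
    ok(k) = numel(t) == numel(x);
end
```

Any height where ok is false would still have a timestamp and data vector of different lengths.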
Last edited by Andy.Clifton on Wed May 01, 2013 4:49 pm, edited 1 time in total.
Reason: tidied up

Postby Everett.Perry » Thu May 02, 2013 8:53 am

I didn’t have any problems with this file. However, since this file was a full file to start with (12,000 records), it might be a good idea to test the code on a few files that have some missing records. The following files would be a good test:

1) 10122_00_00_00_020.mat
2) 10122_01_10_00_020.mat
3) 10122_20_50_00_020.mat

I did notice that a few of the variable “labels” for the file “04293_20_40_00_030.mat” may not be correct. Also, quite a few of the variable “units” for the file “04293_20_40_00_030.mat” are not correct. I have posted an Excel file that identifies the issues.
Attachment: M4_20Hz_Label_or_Units_Problem.xlsx (highlighted sections indicate label/units problems)

Postby Andy.Clifton » Thu May 02, 2013 12:32 pm

That was a very helpful spreadsheet, thanks! The example files you chose let me isolate the problem, which was that I was using a slightly different time stamp for the sonics than I should have been (too many variables starting with 'dt...'!). The impact of that should have been minimal, though. At worst the time stamp may have been a few samples off. It's now correct.

I've uploaded a new test file at http://wind.nrel.gov/MetData/M4Twr/V1_26RC2/10122_00_00_00_020.mat. That file is processed with a new code version (release candidate #2 for version 1.26). The changes include the following:
  • More uniform sonic anemometer variable names (you may need to update some scripts to use the new variable names)
  • Units for the sonic anemometers and for some other variables are now correct
Data processing has not been changed.

Postby Everett.Perry » Thu May 02, 2013 1:41 pm

That is great news!

I was a little slow about responding last time, but I will run your new test file this evening and get you some feedback no later than first thing tomorrow morning.

Postby Everett.Perry » Fri May 03, 2013 9:38 am

I didn't have any problems working with the new file. It seems like the record length issues are solved. I did notice one more possible "units" issue, though, for the DeltaT variables: two are in "K" and one is in "°C".

DeltaT_134_88m =

val: [11998x1 double]
label: 'Delta T (134 m)'
units: 'K'
height: 134


DeltaT_26_3m =

val: [11998x1 double]
label: 'Delta T (3 m)'
units: '°C'
height: 3


DeltaT_88_26m =

val: [11998x1 double]
label: 'Delta T (26 m)'
units: 'K'
height: 26

Having said that, I would be very interested to see how file: "10122_01_10_00_020.mat" works with the version 1.26 fixes. This was the file that started crashing my code and alerted me that something wasn't quite right. I don't think it is worth holding up the version 1.26 release to test this one file though. If version 1.26 has no record length issues with file "10122_01_10_00_020.mat", then I think it would be safe to say that the record length issues are "fixed".

Postby Andy.Clifton » Sun May 05, 2013 10:30 am

Thanks again for the feedback. I've released 1.26 with the corrections to the units that you identified (see this post). I've also tidied up the variables for M5.

