*To*: rodrigo@xxxxxxxxxxxxxxxxxxx
*Subject*: [dennou-ruby:003649] Re: JRuby NetCDF-3 support
*From*: Takeshi Horinouchi <horinout@xxxxxxxxxxxxxxxxx>
*Date*: Fri, 09 Aug 2013 10:07:18 +0900
*Cc*: horinout@xxxxxxxxxxxxxxxxx, dennou-ruby@xxxxxxxxxxx

Dear Mr Botafogo,

Thank you for sharing the info. It is nice to hear that NetCDF can be
accessed via JRuby. (So I am forwarding your message to the GFD Dennou Ruby
mailing list; cc'ed.)

By the way, is the user interface of MDArray basically the same as that of
NArray? Is Colt a good library?

regards,
Takeshi Horinouchi

> Dear Mr. Takeshi Horinouchi,
>
> I've used Dennou Club NetCDF-3 previously and I was inspired by your work.
> I've just implemented a first version of NetCDF-3 for JRuby, based on the
> NetCDF-Java libraries from UCAR. I thought you might be interested in
> knowing about it, so I'm including the announcement about this software.
>
> Sincerely,
>
> Rodrigo Botafogo
>
>
> Announcement
> ============
>
> MDArray version 0.5.4 has just been released. MDArray is a
> multidimensional array implemented for JRuby, inspired by NumPy
> (www.numpy.org) and Masahiro Tanaka's NArray (narray.rubyforge.org).
> MDArray stands on the shoulders of Java-NetCDF and Parallel Colt. At this
> point MDArray has libraries for mathematical, trigonometric and
> descriptive statistics methods.
>
> NetCDF-Java Library is a Java interface to NetCDF files, as well as to
> many other types of scientific data formats. It is developed and
> distributed by Unidata (http://www.unidata.ucar.edu).
>
> Parallel Colt
> (http://grepcode.com/snapshot/repo1.maven.org/maven2/net.sourceforge.parallelcolt/parallelcolt/0.10.0/)
> is a multithreaded version of Colt (http://acs.lbl.gov/software/colt/).
> Colt provides a set of Open Source Libraries for High Performance
> Scientific and Technical Computing in Java. Scientific and technical
> computing is characterized by demanding problem sizes and a need for high
> performance at reasonably small memory footprint.
>
> For more information and (some) documentation please go to:
> https://github.com/rbotafogo/mdarray/wiki
>
> What's new:
> ===========
>
> NetCDF-3 File Support
> ---------------------
>
> From Wikipedia, the free encyclopedia:
>
> "NetCDF (Network Common Data Form) is a set of software libraries and
> self-describing, machine-independent data formats that support the
> creation, access, and sharing of array-oriented scientific data. The
> project homepage is hosted by the Unidata program at the University
> Corporation for Atmospheric Research (UCAR). They are also the chief
> source of netCDF software, standards development, updates, etc. The format
> is an open standard. NetCDF Classic and 64-bit Offset Format are an
> international standard of the Open Geospatial Consortium.
>
> The project is actively supported by UCAR. Version 4.0 (released in 2008)
> allows the use of the HDF5 data file format. Version 4.1 (2010) adds
> support for C and Fortran client access to specified subsets of remote
> data via OPeNDAP.
>
> The format was originally based on the conceptual model of the Common Data
> Format developed by NASA, but has since diverged and is not compatible
> with it."
>
> This version of MDArray implements NetCDF-3 file support only. NetCDF-4 is
> not yet supported. At the end of this announcement we show the MDArray
> implementation of the NetCDF-3 file writing from the tutorial at:
> http://www.unidata.ucar.edu/software/netcdf-java/tutorial/NetcdfWriting.html
>
>
> MDArray and SciRuby:
> ====================
>
> MDArray subscribes fully to the SciRuby Manifesto (http://sciruby.com/).
>
> "Ruby has for some time had no equivalent to the beautifully constructed
> NumPy, SciPy, and matplotlib libraries for Python.
>
> We believe that the time for a Ruby science and visualization package has
> come.
> Sometimes when a solution of sugar and water becomes super-saturated,
> from it precipitates a pure, delicious, and diabetes-inducing crystal of
> sweetness, induced by no more than the tap of a finger. So is occurring
> now, we believe, with numeric and visualization libraries for Ruby."
>
> MDArray main properties are:
> ============================
>
> + Homogeneous multidimensional array, a table of elements (usually
>   numbers), all of the same type, indexed by a tuple of positive integers;
> + Easy calculation for large numerical multidimensional arrays;
> + Basic types are: boolean, byte, short, int, long, float, double, string,
>   structure;
> + Based on JRuby, which allows importing Java libraries;
> + Operators: +, -, *, /, %, **, >, >=, etc.;
> + Functions: abs, ceil, floor, truncate, is_zero, square, cube, fourth;
> + Binary Operators: &, |, ^, ~ (binary_ones_complement), <<, >>;
> + Ruby Math functions: acos, acosh, asin, asinh, atan, atan2, atanh, cbrt,
>   cos, erf, exp, gamma, hypot, ldexp, log, log10, log2, sin, sinh, sqrt,
>   tan, tanh, neg;
> + Boolean operations on boolean arrays: and, or, not;
> + Fast descriptive statistics from Parallel Colt (complete list found
>   below);
> + Easy manipulation of arrays: reshape, reduce dimension, permute,
>   section, slice, etc.;
> + Support for reading and writing NetCDF-3 files;
> + Reading of two-dimensional arrays from CSV files (mainly for debugging
>   and simple testing purposes);
> + StatList: a list that can grow/shrink and that can compute Parallel Colt
>   descriptive statistics;
> + Experimental lazy evaluation (still slower than eager evaluation).
>
> Descriptive statistics methods imported from Parallel Colt:
> ===========================================================
>
> + auto_correlation, correlation, covariance, durbin_watson, frequencies, geometric_mean,
> + harmonic_mean, kurtosis, lag1, max, mean, mean_deviation, median, min, moment, moment3,
> + moment4, pooled_mean, pooled_variance, product, quantile, quantile_inverse,
> + rank_interpolated, rms, sample_covariance, sample_kurtosis, sample_kurtosis_standard_error,
> + sample_skew, sample_skew_standard_error, sample_standard_deviation, sample_variance,
> + sample_weighted_variance, skew, split, standard_deviation, standard_error, sum,
> + sum_of_inversions, sum_of_logarithms, sum_of_powers, sum_of_power_deviations,
> + sum_of_squares, sum_of_squared_deviations, trimmed_mean, variance, weighted_mean,
> + weighted_rms, weighted_sums, winsorized_mean.
>
> Double and Float methods from Parallel Colt:
> ============================================
>
> + acos, asin, atan, atan2, ceil, cos, exp, floor, greater, IEEEremainder, inv, less, lg,
> + log, log2, rint, sin, sqrt, tan.
>
> Double, Float, Long and Int methods from Parallel Colt:
> =======================================================
>
> + abs, compare, div, divNeg, equals, isEqual (is_equal), isGreater (is_greater),
> + isLess (is_less), max, min, minus, mod, mult, multNeg (mult_neg), multSquare (mult_square),
> + neg, plus (add), plusAbs (plus_abs), pow (power), sign, square.
>
> Long and Int methods from Parallel Colt:
> ========================================
>
> + and, dec, factorial, inc, not, or, shiftLeft (shift_left), shiftRightSigned
>   (shift_right_signed), shiftRightUnsigned (shift_right_unsigned), xor.
>
> MDArray installation and download:
> ==================================
>
> + Install JRuby
> + jruby -S gem install mdarray
>
> MDArray Homepages:
> ==================
>
> + http://rubygems.org/gems/mdarray
> + https://github.com/rbotafogo/mdarray/wiki
>
> Contributors:
> =============
>
> Contributors are welcome.
>
> MDArray History:
> ================
>
> + 07/08/2013: Version 0.5.4 - Support for reading and writing NetCDF-3
>   files;
> + 24/06/2013: Version 0.5.3 - Over 90% performance improvements for
>   methods imported from Parallel Colt and over 40% performance
>   improvements for all other methods (implemented in Ruby);
> + 16/05/2013: Version 0.5.0 - All loops transferred to Java with over 50%
>   performance improvements. Descriptive statistics from Parallel Colt;
> + 19/04/2013: Version 0.4.3 - Fixes a simple, but fatal bug in 0.4.2. No
>   new features;
> + 17/04/2013: Version 0.4.2 - Adds simple statistics and boolean
>   operators;
> + 05/04/2013: Version 0.4.0 - Initial release.
>
> NetCDF-3 Writing with MDArray API
> =================================
>
> require 'mdarray'
>
> class NetCDF
>
>   attr_reader :dir, :filename, :max_strlen
>
>   #---------------------------------------------------------------------------------------
>   #
>   #---------------------------------------------------------------------------------------
>
>   def initialize
>     @dir = "~/tmp"
>     @filename1 = "testWriter"
>     @filename2 = "testWriteRecord2"
>     @max_strlen = 80
>   end
>
>   #---------------------------------------------------------------------------------------
>   # Define the NetCDF-3 file
>   #---------------------------------------------------------------------------------------
>
>   def define_file
>
>     # We pass the directory, filename, filetype and optionally the outside_scope.
>     #
>     # I'm implementing in cygwin, so the need for method cygpath that converts the
>     # directory name to a Windows name. In another environment, just pass the directory
>     # name.
>     #
>     # Inside a block we have another scope, so the block cannot access any variables, etc.
>     # from the outside scope. If we pass the outside scope, in this case we are passing self,
>     # we can access variables in the outside scope by using @outside_scope.<variable>.
>     NetCDF.define(cygpath(@dir), @filename1, "netcdf3", self) do
>
>       # add dimensions
>       dimension "lat", 64
>       dimension "lon", 128
>
>       # add variables and attributes
>       # add Variable double temperature(lat, lon)
>       variable "temperature", "double", [@dim_lat, @dim_lon]
>       variable_att @var_temperature, "units", "K"
>       variable_att @var_temperature, "scale", [1, 2, 3]
>
>       # add a string-valued variable: char svar(80)
>       # note that this is created as a scalar variable although in NetCDF-3 there is no
>       # string type and the string has to be represented as a char type.
>       variable "svar", "string", [], {:max_strlen => @outside_scope.max_strlen}
>
>       # add a 2D string-valued variable: char names(names, 80)
>       dimension "names", 3
>       variable "names", "string", [@dim_names], {:max_strlen => @outside_scope.max_strlen}
>
>       # add a scalar variable
>       variable "scalar", "double", []
>
>       # add global attributes
>       global_att "yo", "face"
>       global_att "versionD", 1.2, "double"
>       global_att "versionF", 1.2, "float"
>       global_att "versionI", 1, "int"
>       global_att "versionS", 2, "short"
>       global_att "versionB", 3, "byte"
>
>     end
>
>   end
>
>   #---------------------------------------------------------------------------------------
>   # Write data to the file defined above
>   #---------------------------------------------------------------------------------------
>
>   def write_file
>
>     NetCDF.write(cygpath(@dir), @filename1, self) do
>
>       temperature = find_variable("temperature")
>       shape = temperature.shape
>       data = MDArray.fromfunction("double", shape) do |i, j|
>         i * 1_000_000 + j * 1_000
>       end
>       write(temperature, data)
>
>       svar = find_variable("svar")
>       write_string(svar, "Two pairs of ladies stockings!")
>
>       names = find_variable("names")
>       # careful here with the shape of a string variable. A string variable has one
>       # more dimension than it should as there is no string type in NetCDF-3. As such,
>       # if we look at names' shape it has 2 dimensions, but we need to create a
>       # one-dimensional string array.
>       data = MDArray.string([3], ["No pairs of ladies stockings!",
>                                   "One pair of ladies stockings!",
>                                   "Two pairs of ladies stockings!"])
>       write_string(names, data)
>
>       # write scalar data
>       scalar = find_variable("scalar")
>       write(scalar, 222.333)
>
>     end
>
>   end
>
>   #---------------------------------------------------------------------------------------
>   # Define a file for writing one record at a time
>   #---------------------------------------------------------------------------------------
>
>   def define_one_at_time
>
>     NetCDF.define(cygpath(@dir), @filename2, "netcdf3", self) do
>
>       dimension "lat", 3
>       dimension "lon", 4
>       # zero sized dimension is an unlimited dimension
>       dimension "time", 0
>
>       variable "lat", "float", [@dim_lat]
>       variable_att @var_lat, "units", "degree_north"
>
>       variable "lon", "float", [@dim_lon]
>       variable_att @var_lon, "units", "degree_east"
>
>       variable "rh", "int", [@dim_time, @dim_lat, @dim_lon]
>       variable_att @var_rh, "long_name", "relative humidity"
>       variable_att @var_rh, "units", "percent"
>
>       variable "T", "double", [@dim_time, @dim_lat, @dim_lon]
>       variable_att @var_t, "long_name", "surface temperature"
>       variable_att @var_t, "units", "degC"
>
>       variable "time", "int", [@dim_time]
>       variable_att @var_time, "units", "hours since 1990-01-01"
>
>     end
>
>   end
>
>   #---------------------------------------------------------------------------------------
>   # Write to the file defined above, one record at a time
>   #---------------------------------------------------------------------------------------
>
>   def write_one_at_time
>
>     NetCDF.write(cygpath(@dir), @filename2, self) do
>
>       lat = find_variable("lat")
>       lon = find_variable("lon")
>
>       # write non-record data to the variables
>       write(lat, MDArray.float([3], [41, 40, 39]))
>       write(lon, MDArray.float([4], [-109, -107, -105, -103]))
>
>       # get record variables from file
>       rh = find_variable("rh")
>       time = find_variable("time")
>       t = find_variable("T")
>
>       # there is no method find_dimension for NetcdfFileWriter, so we need to get the
>       # dimension from a variable.
>       rh_shape = rh.shape
>       dim_lat = rh_shape[1]
>       dim_lon = rh_shape[2]
>
>       (0...10).each do |time_idx|
>
>         # fill rh_data array
>         rh_data = MDArray.fromfunction("int", [dim_lat, dim_lon]) do |lat, lon|
>           time_idx * lat * lon
>         end
>         # reshape rh_data so that it has the same shape as rh variable
>         # Method reshape! reshapes the array in-place without data copying.
>         rh_data.reshape!([1, dim_lat, dim_lon])
>
>         # fill temp_data array
>         temp_data = MDArray.fromfunction("double", [dim_lat, dim_lon]) do |lat, lon|
>           time_idx * lat * lon / 3.14159
>         end
>         # reshape temp_data array so that it has the same shape as temp variable.
>         temp_data.reshape!([1, dim_lat, dim_lon])
>
>         # write the variables
>         write(time, MDArray.int([1], [time_idx * 12]), [time_idx])
>         write(rh, rh_data, [time_idx, 0, 0])
>         write(t, temp_data, [time_idx, 0, 0])
>
>       end # End time_idx loop
>
>     end
>
>   end
>
> end
>
> netcdf = NetCDF.new
> netcdf.define_file
> netcdf.write_file
> netcdf.define_one_at_time
> netcdf.write_one_at_time
>
>
> --
> Rodrigo Botafogo
> Integrando TI ao seu negócio

Takeshi Horinouchi
Faculty of Environmental Earth Science, Hokkaido University
N10W5 Sapporo, Hokkaido 060-0810, Japan
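
Note on string variables: the quoted listing remarks that a NetCDF-3 string variable has one more dimension than expected, because the format has no string type and each string is stored as a fixed-width row of chars. The sketch below illustrates that idea in plain Ruby only. It does not call MDArray or NetCDF-Java; the helper names to_char_row and from_char_row are hypothetical, and padding with NUL characters is an assumption made for the illustration.

```ruby
# Plain-Ruby illustration only; no MDArray or NetCDF-Java calls are used.
MAX_STRLEN = 80 # same value as @max_strlen in the quoted listing

# Hypothetical helper: truncate or pad a string to a fixed width and split it
# into characters, mimicking one row of a char variable such as names(names, 80).
def to_char_row(str, width = MAX_STRLEN)
  str[0, width].ljust(width, "\0").chars.to_a
end

# Hypothetical helper: join the characters again and drop the NUL padding.
def from_char_row(row)
  row.join.delete("\0")
end

names = ["No pairs of ladies stockings!",
         "One pair of ladies stockings!",
         "Two pairs of ladies stockings!"]

# Three strings become a 3 x 80 table of characters.
char_matrix = names.map { |s| to_char_row(s) }
shape = [char_matrix.size, char_matrix.first.size]

puts shape.inspect
puts from_char_row(char_matrix[1])
```

Seen this way, the one-dimensional string array of length 3 passed to write_string in the listing corresponds to a two-dimensional char variable on disk.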

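
Note on record-wise writing: in the quoted listing each time step is reshaped to [1, lat, lon] and written at origin [time_idx, 0, 0] along the unlimited time dimension. The following plain-Ruby sketch mimics that bookkeeping with nested arrays. It is only an illustration under that reading of the listing; write_slab is a hypothetical helper, not part of MDArray or NetCDF-Java.

```ruby
# Plain-Ruby illustration only; no MDArray or NetCDF-Java calls are used.
N_LAT = 3
N_LON = 4

# Stand-in for a variable declared as rh(time, lat, lon) with time unlimited:
# a list of [lat][lon] planes that grows as records are written.
rh = []

# Hypothetical helper: copy a slab of shape [n, rows, cols] into the variable,
# starting at origin [t0, lat0, lon0]. Missing time records are created on demand.
def write_slab(var, slab, origin)
  t0, lat0, lon0 = origin
  slab.each_with_index do |plane, dt|
    var[t0 + dt] ||= Array.new(N_LAT) { Array.new(N_LON, 0) }
    plane.each_with_index do |row, i|
      row.each_with_index do |value, j|
        var[t0 + dt][lat0 + i][lon0 + j] = value
      end
    end
  end
end

(0...10).each do |time_idx|
  # Same fill rule as rh_data in the listing: time_idx * lat * lon.
  plane = Array.new(N_LAT) { |lat| Array.new(N_LON) { |lon| time_idx * lat * lon } }
  # Wrapping the plane in an array gives it shape [1, N_LAT, N_LON].
  write_slab(rh, [plane], [time_idx, 0, 0])
end

puts rh.size
puts rh[9][2][3]
```

Each call adds exactly one record, so after the loop the stand-in variable holds ten time steps.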
