data

Arrow

🔗 https://arrow.apache.org

Package description

Apache Arrow (incl. PyArrow Python bindings), a cross-language development platform for in-memory data.

Use latest version

module load Arrow

Use specific version

module load Arrow/6.0.0-foss-2021b
module load Arrow/8.0.0-foss-2022a
module load Arrow/11.0.0-gfbf-2022b
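Which version a bare `module load Arrow` resolves to depends on how the site has configured its defaults, so the following is only a sketch of version-aware ordering, not a description of what the module system does. It assumes GNU `sort` (for the `-V` option); `highest_version` is a hypothetical helper name. A plain lexicographic `sort` would rank `11.0.0` below `6.0.0`, which is why version sort is used here.

```shell
# Print the highest entry from a list of "version-toolchain" strings,
# read one per line from stdin. Assumes GNU sort, which provides -V.
highest_version() {
    sort -V | tail -n 1
}

# The three Arrow versions listed above, deliberately out of order.
printf '%s\n' 8.0.0-foss-2022a 11.0.0-gfbf-2022b 6.0.0-foss-2021b | highest_version
```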

BeautifulSoup

🔗 https://www.crummy.com/software/BeautifulSoup

Package description

Beautiful Soup is a Python library designed for quick turnaround projects like screen-scraping.

Use latest version

module load BeautifulSoup

Use specific version

module load BeautifulSoup/4.10.0-GCCcore-11.3.0

DBD-mysql

🔗 https://metacpan.org/pod/distribution/DBD-mysql/lib/DBD/mysql.pm

Package description

Perl binding for MySQL

Use latest version

module load DBD-mysql

Use specific version

module load DBD-mysql/4.050-GCC-11.2.0

DB_File

🔗 https://perldoc.perl.org/DB_File.html

Package description

Perl5 access to Berkeley DB version 1.x.

Use latest version

module load DB_File

Use specific version

module load DB_File/1.857-GCCcore-11.2.0

GDAL

🔗 https://www.gdal.org

Package description

GDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single abstract data model to the calling application for all supported formats. It also comes with a variety of useful commandline utilities for data translation and processing.

Use latest version

module load GDAL

Use specific version

module load GDAL/3.3.2-foss-2021b
module load GDAL/3.5.0-foss-2022a
module load GDAL/3.6.2-foss-2022b

HDF

🔗 https://www.hdfgroup.org/products/hdf4/

Package description

HDF (also known as HDF4) is a library and multi-object file format for storing and managing data between machines.

Use latest version

module load HDF

Use specific version

module load HDF/4.2.15-GCCcore-11.2.0
module load HDF/4.2.15-GCCcore-11.3.0
module load HDF/4.2.15-GCCcore-12.2.0

HDF5

🔗 https://portal.hdfgroup.org/display/support

Package description

HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data.

Use latest version

module load HDF5

Use specific version

module load HDF5/1.10.5-gompi-2019b
module load HDF5/1.10.5-gompic-2019b
module load HDF5/1.10.7-gompi-2020b
module load HDF5/1.10.7-gompi-2021a
module load HDF5/1.10.7-gompic-2020b
module load HDF5/1.10.8-gompi-2021b
module load HDF5/1.12.1-gompi-2021b
module load HDF5/1.12.2-gompi-2022a
module load HDF5/1.14.0-gompi-2022b

Hydra

🔗 https://hydra.cc/

Package description

Hydra is an open-source Python framework that simplifies the development of research and other complex applications. The key feature is the ability to dynamically create a hierarchical configuration by composition and override it through config files and the command line. The name Hydra comes from its ability to run multiple similar jobs - much like a Hydra with multiple heads.

Use latest version

module load Hydra

Use specific version

module load Hydra/1.1.1-GCCcore-10.3.0
module load Hydra/1.2.0-GCCcore-11.3.0

Jansson

🔗 https://www.digip.org/jansson/

Package description

Jansson is a C library for encoding, decoding and manipulating JSON data. Its main features and design principles are:

* Simple and intuitive API and data model
* Comprehensive documentation
* No dependencies on other libraries
* Full Unicode support (UTF-8)
* Extensive test suite

Use latest version

module load Jansson

Use specific version

module load Jansson/2.13.1-GCC-11.2.0
module load Jansson/2.14-GCC-11.3.0

LAME

🔗 http://lame.sourceforge.net/

Package description

LAME is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.

Use latest version

module load LAME

Use specific version

module load LAME/3.100-GCCcore-8.3.0
module load LAME/3.100-GCCcore-10.2.0
module load LAME/3.100-GCCcore-10.3.0
module load LAME/3.100-GCCcore-11.2.0
module load LAME/3.100-GCCcore-11.3.0
module load LAME/3.100-GCCcore-12.2.0

MariaDB

🔗 https://mariadb.org/

Package description

MariaDB is an enhanced, drop-in replacement for MySQL. Included engines: MyISAM, Aria, InnoDB, RocksDB, TokuDB, OQGraph, Mroonga.

Use latest version

module load MariaDB

Use specific version

module load MariaDB/10.6.4-GCC-11.2.0

PnetCDF

🔗 https://parallel-netcdf.github.io/

Package description

Parallel netCDF: A Parallel I/O Library for NetCDF File Access

Use latest version

module load PnetCDF

Use specific version

module load PnetCDF/1.12.2-gompi-2020b
module load PnetCDF/1.12.2-gompic-2020b
module load PnetCDF/1.12.3-gompi-2021b

PyTables

🔗 https://www.pytables.org

Package description

PyTables is a package for managing hierarchical datasets, designed to cope efficiently and easily with extremely large amounts of data. PyTables is built on top of the HDF5 library, using the Python language and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code (generated using Cython), makes it a fast yet extremely easy-to-use tool for interactively browsing, processing and searching very large amounts of data. One important feature of PyTables is that it optimizes memory and disk resources so that data takes much less space (especially if on-the-fly compression is used) than other solutions such as relational or object-oriented databases.

Use latest version

module load PyTables

Use specific version

module load PyTables/3.8.0-foss-2022a

XML-LibXML

🔗 https://metacpan.org/pod/distribution/XML-LibXML/LibXML.pod

Package description

Perl binding for libxml2

Use latest version

module load XML-LibXML

Use specific version

module load XML-LibXML/2.0207-GCCcore-11.2.0

dask

🔗 https://dask.org/

Package description

Dask natively scales Python. Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love.

Use latest version

module load dask

Use specific version

module load dask/2021.2.0-foss-2020b
module load dask/2021.2.0-fosscuda-2020b
module load dask/2022.1.0-foss-2021b

dill

🔗 https://pypi.org/project/dill/

Package description

dill extends Python's pickle module for serializing and de-serializing Python objects to the majority of the built-in Python types. Serialization is the process of converting an object to a byte stream; the inverse is converting a byte stream back to a Python object hierarchy.

Use latest version

module load dill

Use specific version

module load dill/0.3.6-GCCcore-11.3.0

h5py

🔗 https://www.h5py.org/

Package description

HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a versatile, mature scientific software library designed for the fast, flexible storage of enormous amounts of data.

Use latest version

module load h5py

Use specific version

module load h5py/2.10.0-fosscuda-2019b-Python-3.7.4
module load h5py/3.1.0-foss-2020b
module load h5py/3.1.0-fosscuda-2020b
module load h5py/3.6.0-foss-2021b
module load h5py/3.7.0-foss-2022a
module load h5py/3.8.0-foss-2022b
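The module names on this page appear to follow the pattern `name/version-toolchain`, with an optional suffix such as `-Python-3.7.4` on older entries. As a minimal sketch under that assumption, the hypothetical helper below splits a spec into its parts using only POSIX parameter expansion. It assumes the version itself contains no hyphen, which holds for every entry listed here.

```shell
# Split a module spec of the form NAME/VERSION-TOOLCHAIN[-SUFFIX].
# Assumption: the version part contains no "-" character.
split_module_spec() {
    spec=$1
    name=${spec%%/*}        # text before the first "/"
    rest=${spec#*/}         # text after the first "/"
    version=${rest%%-*}     # text before the first "-"
    toolchain=${rest#*-}    # remainder: toolchain plus any suffix
    printf 'name=%s version=%s toolchain=%s\n' "$name" "$version" "$toolchain"
}

split_module_spec h5py/3.8.0-foss-2022b
split_module_spec h5py/2.10.0-fosscuda-2019b-Python-3.7.4
```

Hyphens in the package name (for example `DBD-mysql`) are handled correctly because the name is taken from everything before the slash.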

netCDF

🔗 https://www.unidata.ucar.edu/software/netcdf/

Package description

NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.

Use latest version

module load netCDF

Use specific version

module load netCDF/4.7.1-gompi-2019b
module load netCDF/4.7.1-gompic-2019b
module load netCDF/4.7.4-gompi-2020b
module load netCDF/4.7.4-gompic-2020b
module load netCDF/4.8.1-gompi-2021b
module load netCDF/4.9.0-gompi-2022a
module load netCDF/4.9.0-gompi-2022b
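When several of these libraries are used together, it is typically safest to pick builds from the same toolchain generation (the `2022a`, `2022b`, ... part of the name). The sketch below filters a list of specs by that trailing generation string; `filter_generation` is a hypothetical helper, and the heuristic does not cover `GCCcore-` or `GCC-` builds, whose names carry a compiler version instead of a generation.

```shell
# Keep only the module specs (one per line on stdin) whose name ends
# with the given toolchain generation, e.g. "2022a".
filter_generation() {
    grep -- "-$1\$"
}

# A few specs taken from the lists on this page.
printf '%s\n' \
    HDF5/1.12.2-gompi-2022a \
    HDF5/1.14.0-gompi-2022b \
    netCDF/4.9.0-gompi-2022a \
    netCDF/4.9.0-gompi-2022b \
    h5py/3.7.0-foss-2022a |
    filter_generation 2022a
```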

netCDF-Fortran

🔗 https://www.unidata.ucar.edu/software/netcdf/

Package description

NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.

Use latest version

module load netCDF-Fortran

Use specific version

module load netCDF-Fortran/4.5.3-gompi-2020b
module load netCDF-Fortran/4.5.3-gompi-2021b
module load netCDF-Fortran/4.5.3-gompic-2020b

pugixml

🔗 https://pugixml.org/

Package description

pugixml is a lightweight C++ XML processing library.

Use latest version

module load pugixml

Use specific version

module load pugixml/1.12.1-GCCcore-11.2.0
module load pugixml/1.12.1-GCCcore-11.3.0

pycocotools

🔗 https://pypi.org/project/pycocotools

Package description

Official APIs for the MS-COCO dataset

Use latest version

module load pycocotools

Use specific version

module load pycocotools/2.0.4-foss-2021a
module load pycocotools/2.0.5-foss-2022a

scikit-learn

🔗 https://scikit-learn.org/stable/index.html

Package description

Scikit-learn integrates machine learning algorithms in the tightly-knit scientific Python world, building upon numpy, scipy, and matplotlib. As a machine-learning module, it provides versatile tools for data mining and analysis in any field of science and engineering. It strives to be simple and efficient, accessible to everybody, and reusable in various contexts.

Use latest version

module load scikit-learn

Use specific version

module load scikit-learn/0.23.2-foss-2020b
module load scikit-learn/0.23.2-fosscuda-2020b
module load scikit-learn/1.0.1-foss-2021b
module load scikit-learn/1.1.2-foss-2022a
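In a job script it can help to load all required modules in one place and stop at the first one that fails. The following is a minimal sketch, not a site-provided tool: `load_all` is a hypothetical helper, and it assumes that the `module` command signals failure through a non-zero exit status, which may not hold for every module system.

```shell
# Load each module spec in turn; stop at the first failure.
# Assumption: `module load` returns a non-zero exit status on failure.
load_all() {
    for spec in "$@"; do
        if ! module load "$spec"; then
            echo "failed to load $spec" >&2
            return 1
        fi
    done
}

# Example call (run where the `module` command is available):
#   load_all scikit-learn/1.1.2-foss-2022a h5py/3.7.0-foss-2022a
```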