The top questions are answered directly below. If your question is not among these, please see the rest of the FAQ below. It is very detailed and most likely has the answer you need.


Top questions


1. I am looking for [...] data or content.

You have several options:

  1. Browse the PhysioBank signal archives, which list the databases sorted by signal category.
  2. Use the keyword search, which is a Google search through PhysioNet. Type keywords into the search bar at the top right of the web page to search the website’s content.
  3. Use the PhysioBank record search. Specific instructions are on the page. For more information about the PhysioBank record search, see the page about the physiobank-index, the large metadata file on which the search is based.
  4. Explore MIMIC-III, a massive healthcare dataset collected from over 40000 critical care patients. MIMIC-III is not part of PhysioBank but a project in PhysioNetWorks, so it is not fully open access: users must sign a data use agreement and apply for access. Some subsets of older versions of MIMIC are part of PhysioBank and can be found in the signal archives.

If you are looking for a very modern or unconventional type of recording, note that PhysioNet does not have every type of data. All databases hosted by PhysioNet are listed in the PhysioBank signal archives directory mentioned above.

*Do not email the PhysioNet webmaster asking them to find you data. All the search resources are mentioned above.

2.1. What do these file formats mean? Which files are the data and/or annotations?

The data and annotations in most PhysioBank databases are stored in Waveform Database (WFDB) compatible formats, which fall into two standard categories:

MIT Format
  • MIT Signal files (.dat) are binary files containing samples of digitized signals. These store the waveforms, but they cannot be interpreted properly without their corresponding header files. These files are in the form: RECORDNAME.dat.
  • MIT Header files (.hea) are short text files that describe the contents of associated signal files. These files are in the form: RECORDNAME.hea.
  • MIT Annotation files are binary files containing annotations (labels that generally refer to specific samples in associated signal files). Annotation files should be read with their associated header files. If you see files in a directory called RECORDNAME.dat, or RECORDNAME.hea, any other file with the same name but different extension, for example RECORDNAME.atr, is an annotation file for that record.
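Because a header file is plain text, its main fields can be pulled out with a few lines of code. The sketch below is illustrative only. It assumes a simple header whose first line holds the record name, number of signals, sampling frequency, and number of samples, and whose signal lines begin with the file name, the storage format, and an optional gain(baseline)/units field. The function name parse_mit_header and the sample header text are made up for this example; real headers can carry more fields, and the WFDB software remains the safer way to read them.

```python
import re

# Matches the gain field of a signal line, e.g. "200", "200/mV" or "200(1024)/mV".
GAIN_FIELD = re.compile(
    r"(?P<gain>[^(/]+)(?:\((?P<baseline>-?\d+)\))?(?:/(?P<units>.+))?"
)


def parse_mit_header(text):
    """Parse a simple MIT-format header (hypothetical helper, not part of WFDB)."""
    # Drop blank lines and comment lines, which start with '#'.
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]

    # Record line: name, number of signals, then optional frequency and length.
    name, n_sig, *rest = lines[0].split()
    record = {
        "name": name,
        "n_sig": int(n_sig),
        "fs": float(rest[0]) if rest else None,
        "n_samples": int(rest[1]) if len(rest) > 1 else None,
        "signals": [],
    }

    # One signal line per channel follows the record line.
    for line in lines[1 : 1 + record["n_sig"]]:
        fields = line.split()
        signal = {"file": fields[0], "format": fields[1],
                  "gain": None, "baseline": None, "units": None}
        match = GAIN_FIELD.fullmatch(fields[2]) if len(fields) > 2 else None
        if match:
            signal["gain"] = float(match["gain"])
            if match["baseline"] is not None:
                signal["baseline"] = int(match["baseline"])
            signal["units"] = match["units"]
        record["signals"].append(signal)
    return record


# Made-up header text, used only to exercise the parser.
EXAMPLE_HEADER = """\
# illustrative header, not taken from any database
rec01 2 360 1000
rec01.dat 16 200(0)/mV 16 0 0 0 0 ECG
rec01.dat 16 100/mmHg 16 0 0 0 0 pressure
"""

header = parse_mit_header(EXAMPLE_HEADER)
```

The gain and baseline recovered here are the values needed later to map stored integers back to physical units.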
European Data Format (EDF)
  • EDF files contain digital signals stored in their standard international format. EDF files store their header information at the beginning of the file, as opposed to MIT format which has a separate header file. Since recent versions of the WFDB library can read them directly, EDF is a WFDB and PhysioBank-compatible format. EDF files may also have associated annotation files. For example if a directory contains RECORDNAME.edf and RECORDNAME.edf.qrs, the .qrs file is the annotation file associated with the record.
  • EDF+ files are EDF files that also contain annotations encoded as signals.
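The naming conventions above can be applied mechanically to a directory listing. The following is a minimal sketch, assuming that every MIT record has a RECORDNAME.hea file, that its signal file is RECORDNAME.dat, and that EDF annotation files are named RECORDNAME.edf.ANNOTATOR. The function name group_record_files is hypothetical, and databases whose signal files are named differently from the record would need extra handling.

```python
from collections import defaultdict


def group_record_files(filenames):
    """Group file names into signal, header and annotation files per record."""
    names = sorted(set(filenames))
    # An EDF record is named after the whole file, e.g. "sleep.edf".
    edf_records = {n for n in names if n.endswith(".edf")}
    # A MIT record is named after its header file without the extension.
    mit_records = {n[: -len(".hea")] for n in names if n.endswith(".hea")}

    records = defaultdict(lambda: {"signal": [], "header": [], "annotations": []})
    for name in names:
        stem, _, ext = name.rpartition(".")
        if name in edf_records:
            # EDF keeps its header inside the signal file itself.
            records[name]["signal"].append(name)
        elif stem in edf_records:
            records[stem]["annotations"].append(name)
        elif stem in mit_records:
            kind = {"dat": "signal", "hea": "header"}.get(ext, "annotations")
            records[stem][kind].append(name)
    return dict(records)


listing = ["100.dat", "100.hea", "100.atr", "sleep.edf", "sleep.edf.qrs"]
grouped = group_record_files(listing)
```

Files that do not belong to any detected record are simply left out of the result.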

2.2. How do I read them?

PhysioNet provides the WFDB software package, which is useful for reading, writing, and processing the WFDB files described above. See the WFDB Applications Guide for details about its many functions.

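To read MIT format signals and annotations RECORDNAME.dat, RECORDNAME.hea, and RECORDNAME.qrs, use rdsamp -r RECORDNAME for the signals and rdann -r RECORDNAME -a qrs for the annotations.

To read EDF format signals and annotations RECORDNAME.edf and RECORDNAME.edf.atr, use rdsamp -r RECORDNAME.edf for the signals and rdann -r RECORDNAME.edf -a atr for the annotations.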

There is also the WFDB Matlab toolbox, a Matlab implementation of the WFDB software package. See also the development repository.

Finally, there is the WFDB Python package, which contains functions to read MIT WFDB format signal and annotation files into Python data structures. Release versions are hosted on PyPI and can be installed from your terminal by calling: pip install wfdb.

3. How do I get a dataset into text/matlab/python so I can process it?

There are several ways:

  • Install and use our WFDB software package. It is a large collection of software for signal reading, writing, processing, and automated analysis. See the WFDB Applications Guide for details about its many functions. Some basic commands include: rdsamp, rdann, wfdb2mat. For example, rdsamp can convert a record into a text file (for more details, see How to obtain PhysioBank data in text format), and wfdb2mat can convert a record into a Matlab matrix.
  • Install and use the WFDB Matlab toolbox, which is a Matlab implementation of the WFDB software package.
  • Use the PhysioBank ATM. Under 'Input' select your database and record. Under 'Output/length' select 'to end'. Under 'Toolbox' select 'Export signals as .mat' or whatever format you want. Note that the page shows the commands from the WFDB software package used to generate the files or graph you request. We highly recommend downloading the WFDB software package, which is full of useful and powerful commands.
  • Install and use the WFDB Python package, which contains Python functions to read MIT WFDB signals and annotations into Python.

4.1. How do I know if the signals are digital or physical?

Easy method - just have a look at the signal numbers:

  • If they are all integers in the range [-2^(N-1), 2^(N-1)-1] or [0, 2^N-1], they are probably digital. Compare the values with the expected physiological range of the signal you are analyzing. For example, if the header states that the signal is an ECG stored in millivolts, which typically has an amplitude of about 2 mV, a signal of integers ranging from -32000 to 32000 is probably not the physical ECG in millivolts.
  • If they are not integers then they are physical. Once again you can quickly compare the values to see if they are in the expected physiological range of the signal you are analyzing.
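The quick check above can be written as a small heuristic. This is only a sketch under stated assumptions: the function name looks_digital is made up, the bit depth N must be supplied by the caller, and a signal counts as digital when every value is a whole number inside the signed or unsigned N-bit range. A physical signal that happens to contain only whole numbers would fool it, so the comparison with the expected physiological range is still needed.

```python
def looks_digital(samples, n_bits=16):
    """Guess whether samples are raw digital values (heuristic, not definitive)."""
    # Any fractional (or NaN) value means the samples cannot be raw integers.
    if not all(float(s).is_integer() for s in samples):
        return False
    signed_lo, signed_hi = -2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1
    unsigned_hi = 2 ** n_bits - 1
    fits_signed = all(signed_lo <= s <= signed_hi for s in samples)
    fits_unsigned = all(0 <= s <= unsigned_hi for s in samples)
    return fits_signed or fits_unsigned


ecg_counts = [-32000, 15, 1024, 32000]     # whole numbers, far outside a mV range
ecg_millivolts = [-0.12, 0.03, 1.85, 0.4]  # fractional values in a plausible mV range
```

With these two made-up lists, looks_digital(ecg_counts) is True and looks_digital(ecg_millivolts) is False.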

4.2. Does WFDB give digital or physical values?

  • The WFDB Software Package's (10.5.24) rdsamp produces digital values by default. You can use the -p option to obtain physical values instead.
  • The WFDB Matlab Toolbox's rdsamp produces physical values by default (different from the above). Set the 'rawunits' flag (default 1, for 64 bit double precision floating point physical values) according to your preferences.
  • The WFDB Software Package's and the WFDB Matlab Toolbox's wfdb2mat both produce digital values. You can obtain physical values by using the WFDB Matlab Toolbox's rdmat function after calling wfdb2mat.
Note and warning about manually converting digital to physical units:

If you have digital values, you can manually convert all samples equal to -2^(N-1) into NaN, subtract the baseline, and then divide by the gain for each channel to obtain physical units, where N is the number of bits. But files using WFDB format 80 store integers from 0 to 255, which actually represent integers from -128 to 127, so you would have to first subtract 128, convert all -128 into NaN, and then subtract the stated baseline and divide by the gain. It is generally safer to use rdsamp -p or wfdb2mat + rdmat, which account for these scenarios.
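For readers who still want to see the arithmetic, here is a minimal sketch of the manual conversion for one channel. It assumes that the most negative N-bit value, -2^(N-1), marks a missing sample, and that offset binary storage (such as format 80) is handled by first subtracting 2^(N-1). The function name to_physical and its parameters are made up for this example and are not part of any WFDB tool.

```python
import math


def to_physical(digital, gain, baseline, n_bits=16, offset_binary=False):
    """Convert the digital sample values of one channel to physical units."""
    half_range = 2 ** (n_bits - 1)
    invalid = -half_range  # assumed marker for a missing sample
    physical = []
    for d in digital:
        if offset_binary:
            d -= half_range  # e.g. stored 0..255 becomes -128..127
        if d == invalid:
            physical.append(math.nan)
        else:
            physical.append((d - baseline) / gain)
    return physical


# 16 bit example: gain of 200 units per mV and baseline of 1024 (made-up values).
mv = to_physical([1224, 1024, -32768], gain=200, baseline=1024)

# 8 bit offset binary example: a stored 0 stands for -128, the missing-sample marker.
mv8 = to_physical([0, 228, 128], gain=100, baseline=0, n_bits=8, offset_binary=True)
```

Doing the offset step before the missing-sample test matters: in the 8 bit case it is the stored value 0, not -128, that becomes NaN.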

Creating WFDB files from text and wrsamp

Currently wrsamp only uses integer input values, which are directly written to the digital signal file. It is the reverse of rdsamp which reads digital values from signal files. All non-integers will be rounded off, so if you input a physical signal of decimals all under 0.5, the output will just be 0's. This is fine if you already happen to have the digital values in text format, but very troublesome if you only have analogue values.

  • One feature that may help in both instances is the -x option of wrsamp which multiplies each input channel by a specified factor before writing them to the signal file. Do not confuse this with the -G option which only affects writing the header file for interpreting the signal after it has been written. See the wrsamp man page (man wrsamp) for more details.
  • If you have Matlab, you can use the mat2wfdb function from the WFDB Matlab Toolbox, which automatically chooses and applies appropriate gains and offsets to input Matlab signals before writing the output WFDB file.
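The effect of rounding, and of scaling before rounding, is easy to see with a few numbers. This sketch shows only the arithmetic and is not a model of wrsamp itself: the function name to_digital and the scale factor of 200 are assumptions chosen for illustration. Whatever factor is used must also be recorded as the gain in the header so that readers can map the integers back to physical units.

```python
def to_digital(physical, scale=1):
    """Multiply each physical value by a scale factor, then round to an integer."""
    return [round(x * scale) for x in physical]


physical_mv = [0.12, -0.31, 0.45]  # made-up physical values, all below 0.5

unscaled = to_digital(physical_mv)           # every sample collapses to 0
scaled = to_digital(physical_mv, scale=200)  # 200 units per mV keeps the detail
```

Here unscaled is [0, 0, 0], while scaled is [24, -62, 90].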

4.3. Digital vs physical - Concepts of storing and representing information

Researchers want to analyze the actual value of signals, i.e., the value of this ECG signal in millivolts. But to process information using computers, they must collect the signals via some capturing device that discretely samples and digitizes the signals into 2^N levels, where N is the resolution of the device. Each sample captured requires N bits to store, and takes one of 2^N possible integer values. Information is also stored that allows the user or program to map these integers back to the physical values the device managed to capture given its resolution. For example, a 12 bit oscilloscope has 4096 levels with which to capture the range and details of the signal. A higher N resolves finer details, but requires more storage space per sample.
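The digitization step can be sketched in a few lines. The example assumes an idealized device with a fixed input range and evenly spaced levels; the function names and the -5 to +5 mV range are made up for illustration and do not describe any particular recorder.

```python
def quantize(value, n_bits, v_min, v_max):
    """Map a physical value to one of 2^N integer levels (idealized device)."""
    levels = 2 ** n_bits
    step = (v_max - v_min) / levels
    code = int((value - v_min) // step)
    # Values at or beyond the edges of the range are clipped to the end levels.
    return max(0, min(levels - 1, code))


def reconstruct(code, n_bits, v_min, v_max):
    """Map a level back to the physical value at the centre of its step."""
    step = (v_max - v_min) / 2 ** n_bits
    return v_min + (code + 0.5) * step


N_BITS = 12
levels = 2 ** N_BITS            # 4096 levels for a 12 bit device
step_mv = 10.0 / levels         # width of one level over a 10 mV range

code = quantize(1.2345, N_BITS, -5.0, 5.0)
approx = reconstruct(code, N_BITS, -5.0, 5.0)
```

The reconstructed value differs from the original by at most half a step, which is the detail lost to the device's resolution.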

Because the user wants to analyze the actual value of the signals, we can map these digital values back to the original physical values the device managed to capture. These mapped values can be loaded into an environment like Python, C, or Matlab which has the double precision floating point (64 bit) variable type which can represent numbers and decimals to a very fine detail (2^64 = 1.8447e+19 levels of precision!). Then the user will have 'physical' values to process and apply algorithms on in their highly detailed 64 bit environment. Remember however that the original signal resolution is limited by the capturing device, and is not increased just by loading it into a 64 bit environment.

We say that signals are in 'physical units' when the values are used to represent the actual real life values as closely as possible, although obviously everything on the computer is digital and discrete rather than analogue and continuous. This includes our precious 64 bit double precision floating point values, but this is as close as we can get and already very close to the actual physical values, so we refer to them as 'physical'.

Binary files such as the WFDB .dat files store signal values as integers, using enough space per sample to retain the signal's original resolution, but not an excessive amount.

For example, if a 15 bit signal is collected via a capturing device, Physionet will likely store it as a 16 bit signal. Each 16 bit block stores an integer value between -2^15 and 2^15-1, and using the gain and offset stated in the header for each channel, the original physical signal can be mapped out for processing. If we know that the signal only has 15 bits of precision when it was recorded, why not store it as integers in a 16 bit file along with a small header text, rather than waste 4x as much space storing the physical signal using 64 bits per sample? Because the capturing device was exactly 15 bits, assigning more space to allow for storing values that fall between the original values will be wasted and won't make the signal more detailed. Imagine using 5TB vs 20TB of disk space to store the exact same information!
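The storage comparison is simple arithmetic, shown here as a sketch. The recording length and sampling rate are made-up values; only the ratio between 16 bit and 64 bit storage matters.

```python
def storage_bytes(n_samples, bits_per_sample):
    """Bytes needed to store n_samples at a fixed number of bits per sample."""
    return n_samples * bits_per_sample // 8


# Assumed example: one channel sampled at 360 Hz for 24 hours.
n_samples = 360 * 60 * 60 * 24

as_int16 = storage_bytes(n_samples, 16)    # integers plus a small text header
as_float64 = storage_bytes(n_samples, 64)  # physical values as doubles
ratio = as_float64 / as_int16              # 4.0, matching the 4x in the text
```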

5. Help, the data are corrupt / How do I download the files?

No, they’re probably not corrupt. Did you left click on a digital signal file (.edf or .dat) storing the data? That makes your browser view it in text format, which makes no sense. See above for descriptions of file types.

If you want to download the file, right click -> save as. If you want to convert the WFDB or EDF files into another form, see the question about converting file formats, above.

If you want to download an entire database at once, see the downloading-databases section.

6. What do the signals look like? Can I view them before I download them?

You can view all physiobank signals with Lightwave or with the Physiobank ATM.

7. How can I report a problem with Physionet?

If you are experiencing issues when using PhysioNet or if you have a suggestion for improvement, please raise an issue on our issue tracker.

To raise an issue, first navigate to the PhysioNet repository on GitHub. After logging in to GitHub, click on the 'Issues' tab, click 'New issue', add a title and description of the problem, and select the “Submit new issue” button.

General

Sign-in, Accounts, and Passwords

Where is ...

Downloading

PhysioBank Files

Reading and Writing Digitized Signals

Reading and Writing Annotations

Software

Help!

General

How can I get an answer to my question?

Have you read this FAQ? If not, please take a few minutes to do so. It answers many common questions.

Have you tried searching for key words using the Search tool? All text on the PhysioNet web site is indexed and can be found by searching for it. To do this, type one or more terms related to your topic or question into the search box below, then click on the 'Search' button to its right:

A similar search box and button appear at the top right corner of this and almost every other page on PhysioNet.

If you have not found an answer to your question in the PhysioNet FAQ or by a PhysioNet search, you may wish to ask your question by email. If everyone who took the time to ask a question by email first took the time to read How to Ask Questions the Smart Way, we would be able to take the time to answer all of the questions we receive with the detailed and pertinent answers they deserve. Since this will never happen, we give priority in answering questions to those who have read this FAQ. How can we tell who has done so? That’s easy; we look for the magic word in the subject line of the email. (Important: the author of “How to Ask Questions the Smart Way” cannot answer your questions; read his disclaimer!)

What is all of this, anyway?

You’re looking at the PhysioNet web site, or one of its mirrors. Read more about PhysioNet and the NIH-sponsored research resource to which it belongs here.

We have large collections of physiologic signals (time series) and software that can be used to study these signals, and smaller but growing collections of research papers, tutorials, and reference materials that relate to the signals and software.

Who are you?

We are a diverse group of computer scientists, physicists, mathematicians, biomedical researchers, clinicians, and educators at MIT (Cambridge, MA, USA), the Beth Israel Deaconess Medical Center, and Harvard Medical School (Boston, MA, USA). Many of us have worked together for 20 years or even longer on problems relating to characterizing and understanding the dynamics of human physiology and the implications of dynamical change in diagnosis and treatment of pathophysiology.

PhysioNet receives contributions of data, software, publications, and tutorials from researchers worldwide; see the PhysioNet Contributors page for a list.

Why is PhysioNet here?

You can’t learn everything there is to know about snow by studying a single snowflake, or even a few hundred of them. In much the same way (and for some of the same reasons), physiologic signals display astonishing diversity, between individuals and even within individual subjects over time. To study them seriously requires large amounts of data that are difficult and expensive to gather and to characterize, and software that can be flexibly and efficiently modified to meet the unique requirements of new research.

PhysioNet is here first of all because we (see the previous question) needed to gather such data and to design such software for our own work. Having done so, we believe that other researchers should not be forced to do the same, and that by making our data and software available, others should be able to explore them, to develop, test, and refine hypotheses; in short, to do investigations that would not be possible otherwise.

Many researchers around the world share this vision of open science, in which investigators who need data with which to test their ideas can bootstrap their studies using large, freely available, and well-understood data collections, and in which investigators wishing to explore their data using a wide variety of methods can find verifiable, open-source, reference implementations of analysis software that can be adapted to their own studies. PhysioNet began in 1999 with our own collections of data and software, but its archives continue to grow in scope and depth thanks to the contributions of many others.

Who can use data and software from PhysioNet?

These materials have been placed here for the use of researchers anywhere in the world (our visitors in the month of December 2009 came from at least 149 countries and territories on every continent, including Antarctica). Many of them are biomedical and clinical researchers in academia and industry, but others include physicists, mathematicians, computer scientists, educators, graduate and undergraduate university students, and even secondary school students.

Have the PhysioBank data been fully deidentified (anonymized), and may they be used without (further) IRB approval?

Yes.

If you are planning to contribute data to PhysioNet, it is your responsibility to ensure that they have been fully deidentified before transmitting them to us. Please review our guidelines for contributors. Our software for deidentification of free text medical records may be helpful while preparing data to be contributed.

Is all of this really free?

Yes.

We encourage contributions of data and software to PhysioNet, but only if contributors are willing to allow their contributions to be used freely. See our guidelines for contributors and our copying policy.

How can I buy a copy of ...?

See the answer to the previous question.

Printed copies of some of our books are now available at the PhysioNet Bookstore.

Please send me a copy of ...

Everything we have is free, and can be freely redistributed. You can download it yourself, or you can ask a friend to download it for you. We understand that web access can be slow or expensive in some locations; please understand that preparing and mailing materials from this web site to individual users would also be slow and expensive.

For downloading tips, read the questions and answers below, beginning with How can I download binary files?.

What are the license terms?

The software is licensed under the GNU General Public License (GPL), or (if noted in the source files) other licenses that conform to the Open Source Definition. These licenses permit verbatim copying and redistribution of the source files, and generally grant other permissions as well. For further details, see Can I use your code in my commercial application? (below).

There is nothing analogous to the GPL for data, but we permit copying and redistribution of unaltered data from this site without restrictions, in the spirit of the GPL. We do not allow distribution of altered data except under conditions that make it clear that the data have been altered, because it is very important that users should be able to distinguish between original data from this site and modified versions of those data.

Other materials from this site (books, tutorials, papers, and commentary) may be reproduced freely, with appropriate credit to the original authors.

See the PhysioNet Copying Policy for further details.

Is this software Y2K-compliant?

Yes. See our statement of Y10K compliance.

This really isn’t a frequently-asked question any more. The last person who asked it sent his question by email, dated 1 January 103.

My connection is slow. Is there a mirror?

Yes. See Mirrors for a list.

Can I set up a mirror?

Yes. Please use rsync as described in How to set up a mirror of PhysioNet.

Will you post a link to my web site?

Probably not, unless it is directly relevant to the content of PhysioNet. Most external links on this site reference publications and other materials that provide additional information, examples of use, or context for PhysioBank data or PhysioToolkit software. We also maintain short and highly selective lists of other data and software resources likely to be of interest to PhysioNet visitors. These lists are limited to non-commercial sites that provide access unavailable elsewhere to collections of physiologic signals or related data, or open-source software for study of such data.

Sign-in, Accounts, and Passwords

Why should I sign in?

You are not required to log in to use PhysioBank, PhysioToolkit, or the PhysioNet Library, all of which can be accessed freely. Use of PhysioNetWorks is also free, but it requires logging in.

PhysioNetWorks workspaces are available to members of the PhysioNet community for works in progress that will be made publicly available via PhysioNet when complete. Unlike other areas of PhysioNet, these workspaces are password-protected.

Why would I need an account and how do I get one?

Most visitors don’t need accounts (see the previous question).

If you wish to create a PhysioNetWorks project, to join an existing one, or to participate in an annual PhysioNet/Computing in Cardiology Challenge, you will need an account and a password in order to establish your identity and gain access to password-protected workspaces. Owners of PhysioNetWorks projects, which are works in progress, may allow access to invited collaborators only, or they may allow access to PhysioNetWorks members only under the terms of a Data Use Agreement (DUA).

The MIMIC II Clinical Database is an example of a PhysioNetWorks project that requires a password and DUA for access.

To create an account, go to the PhysioNetWorks login page, enter your email address, and click on ‘Create account’. Instructions for setting up your account and choosing your password are sent immediately to the address you enter, with the subject line “PhysioNetWorks login” and the sender address “DoNotReply” at physionet.org, so be sure to enter a valid email address at which you can receive it. If it doesn’t arrive within a few minutes, check that your spam filter has not discarded it. If you forget your password, or wish to change it at any time, simply return to the login page and request a new one.

Since your access to PhysioNet’s restricted or protected content will be interrupted if you lose both your password and access to your registered email address, we suggest not using a temporary address as your account name.

I can’t log in!

Check your assumptions: most users don’t need to log in (see the previous question and answer).

If you really do need to log in, go to the PhysioNetWorks login page and follow the instructions there.

How can I change my PhysioNetWorks password?
How can I change my MIMIC II Explorer/Query Builder password?

PhysioNetWorks users: Go to the PhysioNetWorks login page, enter your email address, and click “Reset password”. Follow the instructions that will be sent to your email address by the autoresponder within a minute or two.

MIMIC Query Builder users: A dedicated server, https://querybuilder-lcp.mit.edu/, provides access to the MIMIC Query Builder. Use your PhysioNetWorks username and password to log in.

Where is ...

Where can I find the specific type of data I need?

Some of the most popular versions of this question are answered in this section; read it first.

The next place to look is in the PhysioBank Archive Index. It lists all of the data collections in PhysioBank, with brief descriptions and links to longer descriptions of each.

If you are looking for records with specific combinations of signals, durations, time or amplitude resolution, annotations of specific types, or female or male subjects of particular ages, try a PhysioBank Record Search to locate relevant data. A limited amount of information about diagnoses and medications is also searchable in this way. A tutorial introduction to this tool is available here.

A PhysioNet (text) search can also be helpful. Using the search box at the top of almost any page on this web site, look for keywords that describe the data you seek.

Where can I find data for healthy subjects?

Most data in PhysioBank have been obtained from subjects with a variety of health problems. About twenty PhysioBank databases, however, include healthy subjects.

The control records (c01, c02, ... c10) from the Apnea-ECG Database were obtained from healthy volunteers during sleep; the recordings each contain a single ECG signal and are each about 8 hours long. Simultaneously recorded respiration and oxygen saturation signals are available for one of these recordings.

The CAP Sleep Database includes 16 full-length polysomnograms of healthy subjects. Signals include 3 or more EEG channels, 2 EOG channels, submentalis and bilateral anterior tibial EMG, airflow, abdominal and thoracic respiratory effort, SaO2, and ECG.

The Fantasia Database is a very well-controlled set of 2-hour recordings of ECG (with beat annotations) and respiration signals from 40 rigorously-screened healthy subjects (20 young, 20 elderly, with equal numbers of men and women in each group). Half of the recordings also include an uncalibrated continuous non-invasive blood pressure signal.

Heart rate time series from five additional groups of healthy volunteers are available in a collection of data used for a study of exaggerated heart rate oscillations during meditation (two groups of meditators recorded before and during meditation, a group of volunteers recorded during sleep, a group of volunteers recorded during metronomic (fixed-rate) breathing, and a group of elite athletes recorded during sleeping hours).

The PTB Diagnostic ECG Database includes records from 52 healthy volunteers; here is a list of them.

The MIT-BIH Normal Sinus Rhythm Database consists of ECG recordings from subjects who were found to have had no significant arrhythmias, ST changes, or known cardiac disease. Since these subjects were recorded for medical reasons, however, they are not necessarily “healthy”, but their medical problems are not heart-related. Subjects included in the Normal Sinus Rhythm RR Interval Database were known to be healthy, however.

The Sleep-EDF Database [Expanded] contains EEG, EOG, and other signals from 42 healthy subjects. (Twenty-two of these had mild difficulty falling asleep, but were otherwise healthy.)

ECG, EMG, GSR, and respiration from seventeen healthy volunteers are included in data collected for a study of Stress Recognition in Automobile Drivers.

All six of PhysioBank’s gait and balance databases include at least some data collected from healthy volunteers. Among PhysioBank’s neuro- and myoelectric databases, several include data from healthy volunteers.

This is not a comprehensive list; depending on your interests, you may find other relevant data in PhysioBank. Read the descriptions of the data collections in the PhysioBank Archive Index to learn about them, and follow the links there and above for additional information.

This list also does not include data sets that are in development within projects on PhysioNetWorks. These data sets are currently accessible to members of the respective projects only. When they are complete, they will become open-access data within PhysioBank. To learn about them, join PhysioNetWorks (it’s free, and it takes only a minute or two) and browse through the list of works in progress. Many project owners welcome other interested researchers to join their projects, so in some cases it may be possible to get access while development is still in progress.

Where can I find serial data (multiple recordings of the same subjects)?

These databases include multiple recordings of some or all subjects:

A few other PhysioBank databases include multiple recordings of a few subjects, but lack information about the sequence of the recordings and the intervals between them:

Studies requiring data collected at different times of the day, or during sleep and non-sleep, etc., may also be able to use segments of long continuous recordings (see the next question).

Where can I find long-duration signals and time series?

Many PhysioBank databases include at least some records that are on the order of 24 hours or longer in duration. These include:

Is the AHA Database available on PhysioNet?

No, it is currently available only from ECRI. Additional information about the AHA Database is available here.

A single sample record that was prepared as an example by the creators of the AHA Database is available in PhysioBank.

Are there any 12-lead (diagnostic) ECGs in PhysioBank?

The PTB Diagnostic ECG Database contains 549 twelve-lead ECGs from 294 subjects. Most of these ECGs are two minutes in duration. (They also include simultaneously recorded Frank XYZ leads.)

The St.-Petersburg Institute of Cardiological Technics 12-lead Arrhythmia Database contains 75 twelve-lead ECGs from 32 subjects. Each recording is 30 minutes in duration.

The PhysioNet/Computing in Cardiology Challenge 2011 addressed the problem of quality assessment of 12-lead ECGs, making use of 1500 twelve-lead ECGs (a training set of 1000 ECGs, and a test set of 500 ECGs). These ECGs are unscored, although those in the training set have been classified individually with respect to acceptability for purposes of diagnostic interpretation.

Twelve-lead ECGs are also available from sources other than PhysioBank, including ECG Wave-Maven, the CSE Database, and the 12-lead ECG Library.

Are updates for CD-ROM databases of physiologic signals available here?

Yes. Find them here.

Where is [some file]?
I can’t find [something]!

The search box is your friend. It’s at the top right corner of nearly every single page on PhysioNet. Use the search box!

Downloading

How can I download binary files?

The details of doing this depend on your web browser, not on anything specific to PhysioNet or to the specific files you wish to download, so the first thing you should do is to learn how to use your web browser. Most browsers have a Help button that can get you started.

In Firefox or Chrome, right-click on the link, and choose “Save Link As...” from the popup menu.

In Lynx, press d to download the target of the highlighted link.

If you are using MS Internet Explorer, it is often possible to download a file simply by left-clicking on the link to that file. This is not a foolproof method, however, since MSIE attempts to guess the file type and may attempt to open the file rather than downloading it. A more reliable method is to right-click on the link and then to choose Save Target As... from the pop-up menu that appears. In most cases, you can accept the suggested file name, but be aware that MSIE will generate a .txt extension for any file that has a name without an extension (such as the files named Makefile that are found throughout PhysioToolkit), so you will need to correct these file names.

In Safari, right-click (or, with a single-button mouse, press the Control key and click), and choose “Download Linked File”.

Many other web browsers, including Galeon, Konqueror, Mozilla, Netscape, and Opera, allow you to download a file by pressing and holding the Shift key while left-clicking on the link to the file you wish to download.

Can I download an entire PhysioBank database in one step?

Yes. Before you do so, however, note that this may not be necessary.

The recommended way to read PhysioBank data files is by using either PhysioToolkit software linked to the WFDB library, or (for those who like to write their own code) your own software linked to the WFDB library. In either case, the WFDB library does the work of finding and reading PhysioBank files. If you have a local copy of a PhysioBank file, the WFDB library reads that copy; otherwise, it reads the file from PhysioNet using the same HTTP protocol that your web browser uses.
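The "local copy first, otherwise HTTP" behavior described above can be pictured with a short sketch. This is not the WFDB library's code: the function name locate_file, the list of local directories, and the example.org base URL are all placeholders chosen for illustration.

```python
from pathlib import Path


def locate_file(filename, local_dirs, remote_base):
    """Return a local path if the file exists locally, else a remote URL."""
    for directory in local_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return str(candidate)  # a local copy wins
    # No local copy found: fall back to fetching over HTTP.
    return remote_base.rstrip("/") + "/" + filename


# Placeholder values; a real setup would use its own directories and server.
source = locate_file(
    "100.hea",
    local_dirs=[".", "/usr/database"],
    remote_base="https://example.org/database",
)
```

With a lookup of this kind, downloading a whole database in advance only changes where the data are read from, not how they are read.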

If you want to read PhysioBank files without using the WFDB library (but why?) you will probably need to reformat the files into some less storage-efficient format first, and to do that you will need to read the original files using the WFDB library. In that case, you may as well allow the WFDB library to read the original files via HTTP, and write only the reformatted files to local storage.

If you decide to download a local copy of an entire database, there are two ways to do so that are much more efficient than downloading the files one at a time using a web browser.

The first method uses rsync, which is the same free software used by the PhysioNet mirrors. Install rsync if you don’t have it already, and then use it to get a list of the databases available via rsync. The output lists the available “modules” (sets of files), with the module names in the first column. Any one of these modules (for example, the AF Termination Challenge Database) can then be downloaded into a subdirectory of its own within /usr/database.

(You may, of course, use any directory for storage of the downloaded files. The suggested directory, /usr/database, is searched by default by the WFDB library, so it’s a good choice.)

To download the MIMIC II Waveform Database Matched Subset, the recommended procedure is slightly different; see these notes.

Using rsync is particularly convenient if you have an unreliable connection; if the transfer is interrupted, simply repeat the command once the connection has been re-established, and rsync will quickly determine where it needs to resume the download. Another advantage of using rsync is that it preserves the timestamps of the original files on PhysioNet, so that if you return to PhysioNet, it will be easy to see if the original files have been updated since the last time you downloaded them. If there have been any updates, you can bring your local copy up-to-date by running the same rsync command that you used to create it, without copying anything that hasn’t changed.

Note that rsync has its own IANA-assigned port (873); if you can reach PhysioNet with your web browser (port 80, HTTP) but not via rsync, your firewall may be blocking port 873.
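As a rough illustration of the rsync procedure, the following POSIX sh sketch wraps the rsync calls in small functions and repeats an interrupted transfer until it completes. The host name physionet.org, the module name aftdb, the -Cavz options, and all of the function names are assumptions made for this example; check the module list reported by the server before relying on them.

```shell
# list_modules: ask the rsync server which modules (databases) it offers.
list_modules() {
    rsync physionet.org::
}

# fetch_module MODULE DEST: copy or update one module into directory DEST.
fetch_module() {
    rsync -Cavz "physionet.org::$1" "$2"
}

# retry MAX CMD [ARGS...]: run CMD until it succeeds, at most MAX times,
# pausing RETRY_DELAY seconds (default 30) between attempts.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$max" ]; then
            sleep "${RETRY_DELAY:-30}"
        fi
    done
    return 1
}

# Example call (needs network access, so it is left commented out):
#   retry 5 fetch_module aftdb /usr/database/aftdb
```

Because rsync skips files that are already complete and unchanged, each repeated attempt only transfers what is still missing.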

The second method is to use the dldatabase function (named dl_database in later releases) from the wfdb-python package; see the package’s code repository and documentation.

There is another method described in the answer to the next question. You can choose it if you can’t use rsync or the wfdb-python package, or if your connection to one of the PhysioNet mirrors (which do not generally support rsync access) is much better than your connection to the PhysioNet master server.

There are so many files in .... Can I get a zip file or a tar archive of it?

You can obtain a tar archive or zip file of any single PhysioBank record using the PhysioBank ATM.

If you would like to download an entire PhysioBank database, see the previous question.

Otherwise, you can try looking for a .zip or a .tar.gz archive in the directory that contains the files of interest, or in its parent directory. If you don’t find one, however, the answer is no. It is necessary to keep individual files available, and maintaining redundant copies of these files within archives would not be the best use of available resources.

There are excellent alternatives, however, to downloading many files one at a time using a web browser. Try using a utility that can do batch-oriented HTTP transfers, such as wget, available from this site in source form for Unix, Mac OS X, or MS-Windows, or in binary form for MS-Windows. Once you have installed wget, you can retrieve a batch of files with a single wget command addressed to physionet.org (or to a nearby PhysioNet mirror).

How can I unpack a .tar.gz archive (a “tarball”)?

These files are gzip-compressed tar archives.

MS Windows: The free 7-zip file archiver can unpack .tar.gz archives as well as most other common compressed formats.

Alternatively, if you have installed Cygwin, follow the instructions below for using GNU tar.

GNU/Linux, Mac OS X: Using GNU tar, the single command tar xfvz foo.tar.gz both decompresses and unpacks foo.tar.gz.

If your browser decompressed the archive while downloading it, omit the z option and just unpack the downloaded file with tar xfv.

Other Unix platforms: Traditional versions of tar may not support GNU tar’s z option. If you have one of these, decompress using gzip before unpacking, and then unpack the decompressed archive: first run gzip -d foo.tar.gz, then run tar xfv foo.tar.

(If you don’t have gzip, free versions are available for all popular operating systems from gzip.org.)

I unpacked the tarball, now where are the files?

An archive named foo.tar.gz would normally be unpacked into a subdirectory (folder) named foo within the current directory (folder). Look at the file names shown during the unpacking process to see where the unpacked files have been written.
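If it is unclear where an archive will put its files, its contents can be listed before unpacking. The sketch below assumes GNU tar (or another tar that accepts the z option) and builds a small stand-in archive in a temporary directory so that it can be run anywhere; foo.tar.gz and the file inside it are placeholders, not real PhysioNet files.

```shell
# Build a small example archive so the commands below have something to
# work on (foo.tar.gz is a stand-in for a real download).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir foo
echo "example" > foo/README
tar czf foo.tar.gz foo
rm -r foo

# List the archive's contents without unpacking it; the leading path
# component shows which subdirectory the files will be written to.
tar tzf foo.tar.gz

# Unpack the archive, then confirm where the files went.
tar xzf foo.tar.gz
ls foo
```

Adding the v option to the unpacking command (tar xzvf) prints each file name as it is written, which gives the same information during unpacking.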

Can I get these files via FTP?

No. It is considered insecure for an FTP server and a web server to share a file system, and it is not practical for us to maintain separate web and FTP servers.

If you are interested in batch file transfers, read the answer to “There are so many files in ....”, above.

Can I look at the waveforms using only my web browser?

Yes. Go to the PhysioBank ATM and fill in the form. For a sample, click here.

PhysioBank Files

What are PhysioBank-compatible (or WFDB-compatible) formats?

The contents of almost all PhysioNet databases are collections of flat files (not relational databases). These files can be read by programs that use the WFDB library to do so. The WFDB library reads files in a variety of formats, presenting their contents in a uniform manner to the programs that use it, so that those programs need not be concerned with the details of the storage formats used in each case. The formats that can be read by the WFDB library are referred to as “PhysioBank-compatible formats”, because they are permissible for files within standard PhysioBank databases. The terms “PhysioBank-compatible” and “WFDB-compatible” are synonymous. Note, however, that the WFDB library is capable of reading a wider variety of formats than those that are actually used within PhysioBank.

Many visitors who ask this question assume that they need to understand the details of PhysioBank’s file formats in order to use PhysioBank. This is not necessary, however. Numerous options exist for reading and writing files in PhysioBank-compatible formats; read the other questions and answers in this section of the PhysioNet FAQ for pointers to many of them. If you really need to know the details of the formats, however, follow the pointers in the next paragraph.

There are several types of files in standard PhysioNet databases:

  • Signal files are binary files containing samples of digitized signals.
  • Header files are short text files that describe the contents of associated signal files.
  • Annotation files are binary files containing annotations (labels that generally refer to specific samples in associated signal files).
  • RECORDS and ANNOTATORS are text files listing the records belonging to the database, and the types of annotation files available for the database. (Each database has its own RECORDS file, and each annotated database has its own ANNOTATORS file.)
  • EDF files are included in some PhysioBank databases in lieu of separate signal and header files. Since recent versions of the WFDB library can read them directly, EDF is a PhysioBank-compatible format.
  • EDF+ files are EDF files that also contain annotations encoded as signals. Although the WFDB library can read the signals from an EDF+ file, it does not support decoding the annotations, so this format is only “mostly PhysioBank-compatible”. The PhysioToolkit application rdedfann can decode these annotations into text that can be easily converted into PhysioBank-compatible annotation files using wrann.
  • Calibration files are text files that define customary scales for each type of signal. (There is a default calibration file containing definitions for most signals appearing in PhysioBank, so most standard PhysioBank databases do not need to have their own calibration files.)

What are .dat, .hea, .atr, .qrs, ... files?

Files belonging to PhysioBank databases have two-part names: the first part is the record name, and the second part (following the “.”) indicates the file type. For example, a file named “chf08.hea” is a file of type .hea (see below) belonging to a record named “chf08”.
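The two-part naming rule can be applied mechanically. As a minimal sketch in POSIX sh (the variable names are chosen only for this example), standard parameter expansion splits a file name at its last “.”:

```shell
# Split a PhysioBank file name into its record name and file type.
file="chf08.hea"
record=${file%.*}     # everything before the last "."  -> chf08
ftype=${file##*.}     # everything after the last "."   -> hea
echo "record=$record type=$ftype"
```

The record name obtained this way (chf08, not chf08.hea) is what WFDB applications expect after their -r option.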

All of these file types are found in PhysioBank databases:

  • .dat files are binary signal files. See the questions and answers below, beginning with “What is the format of the signal files?”, for information about their format and how to read them.
  • .hea files are short text “header” files used by all of the software that reads the signal files to determine their location and format. In some cases, .hea files also contain structured comments that include information about the subjects (e.g., age, gender, medications, diagnoses).
  • .atr and .qrs files (and other files described in the database index pages as annotation files) are binary files containing labels (annotations) that point to specific locations within the signal (.dat) files and describe events at those locations. For example, many of these annotations indicate the times of occurrence and types of individual heart beats in records containing ECG signals. See the questions and answers below, beginning with “What is the format of the annotation files?”, for information about the format of these files and how to read them.

What are .xws files and how can I view them?

These are short text files that point to specific locations within the records with which they are associated. You can view the same locations using, for example, the PhysioBank ATM. If you haven’t set up a browser helper application for viewing .xws files, you can read them as text and copy the database, record, and annotator into the PhysioBank ATM, then navigate to the location of interest.

You can set up WAVE (actually, wavescript) as a helper application for your browser so that when you click on a .xws file, WAVE will open the associated record at the specified location using its built-in HTTP client code (this is much faster and more flexible than using the PhysioNet ATM). See Controlling WAVE from a web browser in the WAVE User’s Guide for details.

What is a “record name” or an “annotator name”?

Records are identified by record names, which contain letters, digits, and underscores. For example, the MIT-BIH Arrhythmia Database has record names consisting of three-digit numbers beginning with ‘1’ or ‘2’, and the European ST-T Database has record names that are four-digit numbers prefixed by ‘e’. Case is significant in record names that contain letters, even in environments such as MS-DOS for which case translation is normally performed by the operating system on file names; thus ‘e0104’ is the name of a record found in the European ST-T Database, whereas ‘E0104’ is not. A record consists of several files, which contain signals, annotations, and specifications of signal attributes; each file belonging to a given record normally includes the record name as part of its name. The files named RECORDS found in the PhysioBank database directories list the record names for each database.

There may be many annotation files associated with the same record; they are distinguished by annotator names. The name of an annotation file is the record name, followed by a ‘.’, followed by the annotator name. The files named ANNOTATORS found in the PhysioBank database directories list the annotator names for the annotation files that are available here. The annotator name ‘atr’ is reserved to identify reference annotation files supplied by the developers of the databases to document correct beat labels. You may use other annotator names (which may contain letters, digits and underscores, as for record names) to identify annotation files that you create. You may wish to adopt the convention that the annotator name is the name of the file’s creator (a program or a person).
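The naming rules above are easy to check in a script. The following POSIX sh sketch is only an illustration: the function names are invented for this example, my_qrs is a hypothetical annotator name, and record names that carry a directory prefix (such as mitdb/100) are deliberately not handled.

```shell
# is_valid_name NAME: succeed if NAME is non-empty and contains only
# letters, digits, and underscores.
is_valid_name() {
    case $1 in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

# annotation_file RECORD ANNOTATOR: print the annotation file name,
# i.e. the record name, a ".", and the annotator name.
annotation_file() {
    if is_valid_name "$1" && is_valid_name "$2"; then
        printf '%s.%s\n' "$1" "$2"
    else
        return 1
    fi
}

annotation_file 100 atr        # prints 100.atr
annotation_file e0104 my_qrs   # prints e0104.my_qrs
```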

How can I run ... on all of the records in a PhysioBank database?

Write a shell script to iterate over the records. You can use the (text) file called RECORDS in each database directory (see the previous question) as the list of records to be processed; wfdbcat can be used to get this file from PhysioBank. For example, a short loop over the output of wfdbcat can run sigamp on each record in the MIT-BIH Arrhythmia Database (mitdb).

Use whatever scripting language you wish; a script written in the standard POSIX sh scripting language can be run in a terminal window on GNU/Linux, Mac OS X, or any other UNIX, or in a Cygwin window under MS-Windows. For a tutorial introduction to writing shell scripts, try the three-part Bash by Example series or the more comprehensive Unix/Linux Shell Scripting Tutorial.
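A loop of this kind might be sketched as follows in POSIX sh. The function name for_each_record and the local file name RECORDS.mitdb are made up for this example, and the commented usage assumes that the WFDB applications wfdbcat and sigamp are installed and that sigamp accepts a record name after -r.

```shell
# for_each_record RECORDS_FILE DB CMD [ARGS...]
# Run "CMD ARGS... -r DB/RECORD" once for each record listed in RECORDS_FILE.
for_each_record() {
    records_file=$1
    db=$2
    shift 2
    while IFS= read -r rec; do
        [ -n "$rec" ] || continue          # skip blank lines
        # Redirect stdin so CMD cannot consume the rest of the record list.
        "$@" -r "$db/$rec" </dev/null
    done < "$records_file"
}

# Hypothetical usage, assuming the WFDB applications are installed:
#   wfdbcat mitdb/RECORDS > RECORDS.mitdb
#   for_each_record RECORDS.mitdb mitdb sigamp
```

Keeping the record list in a local file makes it easy to rerun the loop on a subset by editing the list.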

Where are the annotation, signal, or header files I just created?

WFDB applications can read from local files or directly from remote locations such as the PhysioNet web site, but they always write to local files. In order to read annotation, signal, or header files that you have written, it will usually be simplest to begin within the directory (folder) that was current when they were created.

If you use WFDB applications to create new annotation, signal, or header files, those files are created within the current working directory (or, in some cases, its subdirectories). Thus, for example, the output annotation file created by the command wqrs -r 100s is a file in the current directory named 100s.wqrs. If the record name contains additional path information, the output file is written in a location accessible by following that path from the current directory. For example, the command wqrs -r mitdb/100 writes its output annotation file (100.wqrs) in the mitdb subdirectory of the current working directory. If mitdb doesn’t exist in the current directory, wqrs creates it.

Applications that use the WFDB library behave this way so that their output files can be located by other WFDB applications. For example, given the above, the command wave -r mitdb/100 -a wqrs displays the annotations from the local file created by wqrs together with the corresponding signals from PhysioBank. Neither wqrs nor wave needs to read local copies of the header or the (much larger) signal files, however. If no local copies exist, they are read directly from the PhysioNet server, using the additional path information in the record name (mitdb/, in this example) to find them.

How were the signals in PhysioBank digitized?

They come from many sources, but in all cases the signals have been digitally recorded or digitized from analog recordings. See the descriptions of the individual databases for details.

We are occasionally asked about digitizing paper ECGs and other hard-copy data.A brief survey of this subject is available here.

Should I use PhysioBank formats for my project?

Perhaps we’re biased, but we think so, and here’s why:

  • There is a large amount of free and open-source software that reads and writes data in these formats. (The WFDB software package is a collection of such software, including viewers, signal processing and analysis applications, and an I/O library that can be used to build custom applications that read and write these formats.) If you use PhysioBank formats, you can use all of this software as is.
  • These formats are reasonably storage efficient, while still permitting efficient random access. Since recordings can be of arbitrary duration (some of those in PhysioBank are up to 40 days in length), it is worthwhile to store them efficiently, not only to reduce disk requirements but also to reduce the time needed to transmit them and to read them. It is also worthwhile to be able to read only a segment of interest from somewhere in the middle of a long record, without having to read data sequentially from the beginning. These binary formats are more compact than text-based formats such as FDA XML, and less so than variable-length compressed formats such as SCP ECG; they represent a reasonable compromise in terms of storage efficiency to achieve a significant advantage in use efficiency.
  • These formats, when used to store multiple simultaneous signals, have the advantage over EDF that it is unnecessary to skip around in the record in order to assemble a vector of simultaneous samples of signals (a very common and basic operation in signal processing of multiple signals).

    If you can read an entire EDF frame (often a minute of digitized signals) into memory, and if a latency on the order of the frame length is tolerable, then EDF is a good choice, also.

  • Although it has been argued that meta-information (signal descriptions, sampling frequency, gain, etc.) should be kept in the same file as the digitized signals, there are advantages to keeping this information in separate “header” (.hea) files as is done on PhysioNet:
    • A large number of (very small) header files can be kept in one place (to make it easy to find a record), and the digitized signals in their (possibly very large) signal files can be kept in many locations (not necessarily the same directory, the same disk, or even the same file server).
    • When recording signals, we often do not have a priori knowledge of details such as the length of the recording or the gains of the signal amplifiers (although we might record analog calibration signals so that gains can be measured later on). This information can be added when available to separate header files, without needing to rewrite the possibly huge associated signal files.
    • Occasionally we assemble records from multiple instruments, or by combining recorded signals with additional (computed or synthesized) signals. In this case we can keep signal sets in separate, parallel signal files, and add their metadata to the header file without rewriting the signals.

    Of course, it is possible to embed metadata in a signal file in PhysioBank format if this is desired; we don’t generally do so, for these reasons.

  • There is an enormous amount of data already available in these formats (from PhysioBank and from other sources), so most users of physiologic signal data are already using these formats. For many researchers, the first step in using records in some other format is to convert them to a PhysioBank format, so that they can be analyzed using familiar tools.

Some PhysioBank formats that are good choices for new projects are format 16 (very easy to read using almost any software, though not as storage efficient as format 212 if you have 12-bit or lower resolution) and format 212 (most used in PhysioBank, because it’s ideal for widely available 12-bit resolution data, and it’s still relatively easy to read). Format 8 is not recommended for new projects, because it does not preserve the DC offset when used in random-access mode, and because it limits the maximum slew rate of the signals that can be recorded. (It is supported for historical reasons, and it was devised as a way to circumvent memory and storage limits encountered in recording the MIT-BIH Arrhythmia Database.)

If you need even more storage efficiency than is provided by PhysioBank formats, consider using gzip or bzip2 to compress files stored in format 16 or 212, or (especially for a commercial product) consider SCP ECG.

If you need an easy-to-read format, and efficiency is not a concern, use rdsamp’s output (text) format (see this note).

Reading and Writing Digitized Signals

How can I find out what signals were recorded?

If you are looking for recordings with a specific type of signal, look first in the PhysioBank Archive Index, which indicates in general terms the types of signals in each of the available databases.

Within each database, there may be variations in the choice of signals from record to record. The pages that describe each database (see the links from the PhysioBank Archive Index) can help in locating subsets of records that contain specific signals of interest.

Each recording has an associated header file that lists (among other things) the names of the signals included in that recording. The wfdbdesc utility reads header files and produces a readable summary of their contents, including the signal names. Many other PhysioToolkit applications that read PhysioBank data are capable of printing or displaying signal names. The PhysioBank ATM shows the names of the signals belonging to the selected record (in the drop-down Signals list).

What do the signal names MLII, V2, ... mean?

Short answer: MLII and V2 are ECG signals. The names refer to the electrode positions, using standard nomenclature for lead names. MLII is “modified lead II”, a bipolar lead parallel to the standard limb lead II, but acquired using electrodes placed on the torso (a requirement for long-term ECG monitoring). V2 is a precordial lead that is roughly orthogonal to MLII. These two leads are favored for many recordings, since MLII yields high-amplitude normal QRS complexes in most subjects, and V2 usually offers a nearly optimal frontal-plane projection of any ectopic beats that happen to be of low amplitude in MLII.

Long answer: Signals in PhysioBank databases have standardized names (see the previous question). Most of these are in common clinical use for designating signals such as arterial blood pressure (ABP), respiration (RESP), or heart rate (HR). Generally the pages that describe the databases in which these signals appear include definitions of any unusual signal names (see the links to these pages from the PhysioBank Archive Index).

Most ECG recordings contain two or more simultaneously recorded ECG signals, called “leads.” Since the heart generates an electrical field that varies spatially as well as temporally, there is no uniquely determined (scalar) signal that offers a complete view of cardiac electrical activity. The standard practice among clinicians and researchers interested in the ECG is to record two or more signals (leads) derived using sensing electrodes placed at certain specific locations. Some leads are bipolar (they are potential differences between pairs of electrodes); others are unipolar (they are potentials measured with respect to an artificial “zero” reference potential typically derived by summing potentials measured at multiple locations). Confusingly, the wires that connect the electrodes to the recording equipment are also called “leads”; thus, for example, a five-lead (five-wire) harness is generally used to record a two-lead (two-signal) ECG!

The three Einthoven bipolar limb leads (designated I, II, and III) are determined by the pairwise potential differences between electrodes placed on the left arm (LA), right arm (RA), and left leg (LL); specifically, lead II is the potential difference between LL and RA. In most subjects, the axis defined by these points is roughly parallel to the mean cardiac electrical axis, so it is a lead in which QRS complexes are typically observable at nearly maximum amplitude.

In long-term ECG recordings (including most of those on PhysioNet), limb leads are not generally used, since physical activity causes significant interference in these leads. Commonly, equivalent “modified” leads are used, with electrodes placed on the torso in positions chosen so that the signals closely match the limb leads. This is possible because the cardiac electrical field is (to a good approximation) a time-varying dipole field, so that it is generally sufficient to choose positions that allow one to observe the same projections of the dipole field onto the axes defined by the limb leads.

For MLII (modified lead II), the LL equivalent electrode is ideally placed at the left iliac crest, and the RA equivalent electrode is ideally placed in the infraclavicular fossa, medial to the border of the deltoid muscle and 2 cm below the lower border of the clavicle.

Additional information about ECG lead systems can be found in many textbooks about electrocardiography. A clear and comprehensive discussion can be found in chapter 15 of Bioelectromagnetism: Principles and Applications of Bioelectric and Biomagnetic Fields by J Malmivuo and R Plonsey (Oxford University Press, 1995; also available on-line here).

What do the signal names ‘signal 0’, ‘signal 1’, ... mean?

As noted in the answers to the previous two questions, signal names recorded in header files describe the signals in each record. In rare cases, however, this information is missing, usually because it was not preserved when the signals were recorded. Names of the form ‘signal N’ are default signal names used in such cases; they may appear in header files explicitly, or they may be displayed by PhysioToolkit software if no explicit signal names appear in a header file.

What is the format of the signal files?

Many formats are supported. Most signal files are written in “format 212”, in which two 12-bit samples are bit-packed into three 8-bit bytes, or in “format 16”, in which a 16-bit sample is written as two bytes, least significant byte first (“little-endian”). See signal(5) in the WFDB Applications Guide for details on formats 16 and 212 and on the other supported formats.
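To make the byte layouts concrete, here is a small POSIX sh sketch that decodes samples from individual byte values. It assumes two's complement samples and, for format 212, that the low four bits of the middle byte hold the high bits of the first sample and its high four bits hold the high bits of the second; signal(5) remains the authoritative description. The function names are invented for this example.

```shell
# Decode WFDB samples from byte values given as integers (0-255).

# decode16 LO HI: one format 16 sample (little-endian, two's complement).
decode16() {
    v=$(( $1 | ($2 << 8) ))
    if [ "$v" -ge 32768 ]; then
        v=$(( v - 65536 ))
    fi
    echo "$v"
}

# decode212 B0 B1 B2: two format 212 samples packed into three bytes.
decode212() {
    s1=$(( $1 | (($2 & 0x0F) << 8) ))   # low nibble of B1: top 4 bits of sample 1
    s2=$(( $3 | (($2 & 0xF0) << 4) ))   # high nibble of B1: top 4 bits of sample 2
    if [ "$s1" -ge 2048 ]; then s1=$(( s1 - 4096 )); fi
    if [ "$s2" -ge 2048 ]; then s2=$(( s2 - 4096 )); fi
    echo "$s1 $s2"
}

decode16 0x34 0x12          # prints 4660
decode212 0x34 0x12 0x56    # prints 564 342
```

In practice the WFDB library performs this decoding for you; the sketch is only meant to show what “bit-packed” and “little-endian” mean here.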

To determine which format is used for a given signal file, look in the associated header file. (This is a text file that usually has the same name as the signal file, except for a suffix of .hea instead of .dat.) Each line of the header file that begins with the name of the signal file describes the format and contents of a signal within the signal file. See header(5) in the WFDB Applications Guide for details.
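As a sketch of this lookup, the following POSIX sh function prints the second field of every header line that begins with a given signal file name, on the assumption that the format appears in that position as described in header(5). The header shown is made up for illustration (record rec01 is hypothetical), and multi-segment records are not considered.

```shell
# signal_formats HEADER SIGNAL_FILE: print the format field of every
# header line that begins with the name of the signal file.
signal_formats() {
    awk -v f="$2" '$1 == f { print $2 }' "$1"
}

# A made-up header in the usual layout, for illustration only.
hea=$(mktemp)
cat > "$hea" <<'EOF'
rec01 2 250 5000
rec01.dat 16 200 16 0 12 3456 0 ECG1
rec01.dat 16 200 16 0 -7 -1234 0 ECG2
EOF

signal_formats "$hea" rec01.dat   # prints 16 twice, once per signal
rm -f "$hea"
```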

How can I read signal files?

If you would like to read signal files within a C, C++, Fortran, Java, Perl, or Python program, see the WFDB Programmer’s Guide for information on doing this using the WFDB library. Other programming languages supported by SWIG may also be usable with the WFDB library, but have not been tested. Briefly, use isigopen() to open the files, and getvec() to read them.

If you would like to do this within a Matlab or Octave program, we recommend using the WFDB Toolbox for MATLAB. For an overview of this solution and a variety of alternatives of varying degrees of complexity, see Reading and writing PhysioBank and compatible data on the Contributed software for Matlab and Octave page. Note that Matlab and Octave are not able to import most signal files directly; for exceptions, see wfdb2mat.

Another possibility is to convert the portions of interest into text format using rdsamp (described in detail in How to obtain PhysioBank data in text form). To save rdsamp’s output in a file, or to read rdsamp’s output using another program, see this note. Alternatively, segments of up to 100,000 samples in length of signal files found on this web server can be converted into text using the PhysioBank ATM, which can be accessed using your web browser. This may be useful if you wish to read signals using Excel or another spreadsheet (although spreadsheets in general are not recommended as tools for signal processing, visualization, or analysis; there are much better choices freely available in PhysioToolkit).

MS-Windows Media Player and similar software for reading audio and multimedia files cannot be used to read these files (or any others in PhysioBank).

How can I use Matlab’s import feature to read signal files?

You can’t in general, because Matlab doesn’t know how to figure out which of the many supported formats is used in any given signal file, because it can’t understand the most commonly used formats in any case, and because (in many cases) signal files are orders of magnitude larger than any matrix that Matlab can handle.

You can export a segment of signal files up to a million samples in length as a .mat file readable by Matlab or Octave, using the PhysioBank ATM. Longer segments may be difficult to handle, but you can make them if you wish using wfdb2mat, included in the WFDB software package and used by the PhysioBank ATM. The .mat files produced in this way can be read and plotted using plotATM.m.

See How can I read signal files? for a variety of other ways to read signal files from Matlab without using its import feature.

Why does Matlab say “file might be corrupt” when loading a huge .mat file?

Matlab cannot handle version 4 .mat files containing more than 100,000,000 samples. If you need to load a file this large (and you have enough memory to do so), use Octave instead.

In most cases, however, a better strategy is to redesign your program so that it does not need to read the entire record into memory at once.

Is there any direct way of converting sample values to physical units using wfdb2mat?

The program plotATM.m reads the output of wfdb2mat and converts the raw samples into physical units. The conversion is very simple and easily incorporated in your own Matlab or Octave code if you prefer for whatever reason not to use plotATM.m; see plotATM.m for details.

wfdb2mat doesn’t do this conversion itself since this would (1) increase the size of the generated .mat files by a factor of 4 or 8 (depending on the ADC resolution), (2) slow down and significantly complicate wfdb2mat, because of the need to convert from native (IEEE 754 on most platforms) floating-point format to the VAX floating-point format required by Matlab, (3) make the .mat files incompatible with WFDB applications, and (4) make it unnecessarily difficult to distinguish the effects of quantization error from other sources of noise in the signal, for those who might wish to do so.

How can I use Excel’s import feature to read signal files?

You can’t, because Excel doesn’t know how to figure out which of the many supported formats is used in any given signal file, because it can’t understand any of those formats in any case, and because signal files are almost always orders of magnitude larger than any spreadsheet that Excel can handle. Spreadsheets are not suitable for studying, visualizing, or analyzing digitized signals; many better tools are freely available in PhysioToolkit.

If, despite the above, you still wish to read a piece of a signal file into a spreadsheet, see How can I read signal files?.

Where do I get rdsamp?

It’s part of the free, open source WFDB Software Package. Both C sources and binaries for several popular operating systems are available.

How do I use rdsamp?

Install it, then type rdsamp -h for a brief summary of options. For details, see rdsamp(1) in the WFDB Applications Guide.

The output of rdsamp is in text format. Unless you have used the -v option, the output contains data only (no column labels) and can be plotted directly using, for example, plt. The first column contains sample numbers (or elapsed times in seconds and milliseconds if you have used the -p option), and each of the remaining columns contains samples for one signal (in raw ADC units unless you have used the -p option).
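Because this output is plain columns of numbers, standard text tools can process it. The sketch below is a minimal example in POSIX sh with awk; it assumes whitespace-separated columns without labels, and the three input rows are made-up values rather than real rdsamp output.

```shell
# column_range COL: read rdsamp-style text (whitespace-separated columns,
# no labels) on standard input and print the minimum and maximum of COL.
column_range() {
    awk -v c="$1" '
        { v = $c + 0 }                      # force a numeric value
        NR == 1 || v < min { min = v }
        NR == 1 || v > max { max = v }
        END { print min, max }'
}

# Made-up rows: sample number, then two signals in raw ADC units.
printf '%s\n' '0 995 1011' '1 1000 1008' '2 990 1015' | column_range 2
# prints 990 1000
```

With real data, the output of rdsamp for a record would be piped into column_range in place of the printf command.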

To save rdsamp’s output in a file, or to read rdsamp’s output using another program, see this note.

What do the sample values represent?

Analog-to-digital converters (ADCs) are usually used to produce PhysioBank signal files, which consist of sequences of integer samples in unscaled analog-to-digital converter units (adus). Samples are stored in this way not only because doing so usually requires less space than most alternatives, but also because this scheme introduces no loss of precision beyond the quantization error of the ADC. By default, rdsamp outputs sample values in unscaled adus (raw ADC units).

The header file for each record contains fields that describe the characteristics of each signal and of the ADC used to digitize it. These fields include the signal type (such as ECG, ABP, or SpO2), the physical units of the original analog signal (such as mV, mmHg, or degreesC), the gain (the number of adus per physical unit), the baseline (the sample value that would correspond to a physical value of zero, which is often but not always at the center of the ADC range, and may even lie outside of the ADC range), the adczero (the sample value at the center of the ADC range, which is 0 for a bipolar ADC and a non-zero value for an offset binary ADC), and the number of bits of ADC precision (most often 12 for PhysioBank recordings). Taken together, the values specified for these parameters allow identification of the signals, conversion of sample values from raw ADC units to baseline-corrected physical units and back again, and calculation of the ADC range in raw or physical units.
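From the definitions of gain and baseline given here, the conversion to physical units is (sample - baseline) / gain. A minimal POSIX sh sketch follows; the function name and the example values (baseline 1024 adu, gain 200 adu per mV) are illustrative assumptions, not taken from any particular record.

```shell
# to_physical SAMPLE BASELINE GAIN: convert a raw sample value (adus) to
# baseline-corrected physical units, printed to three decimal places.
to_physical() {
    awk -v s="$1" -v b="$2" -v g="$3" 'BEGIN { printf "%.3f\n", (s - b) / g }'
}

# Illustrative values: baseline 1024 adu, gain 200 adu per mV.
to_physical 1224 1024 200    # prints 1.000
to_physical 995 1024 200     # prints -0.145
```

The reverse conversion is sample = physical * gain + baseline, rounded to the nearest integer.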

Using rdsamp’s -p (physical unit output) option, or using the PhysioBank ATM (which uses rdsamp), the sample values are presented in baseline-corrected physical units. The signal types and units used appear in the first two lines of rdsamp’s output when using rdsamp’s -v (verbose output) option. Note that in these cases, the sample values are given to exactly three decimal places by default regardless of the precision of the integer samples, although additional precision can be obtained using rdsamp’s -P option.

The database of Evoked Auditory Responses in Normals across Stimulus Level is the first (and so far, the only) PhysioBank database containing 24-bit and 32-bit signals. All other current PhysioBank databases were recorded using ADCs with resolutions of 16 bits or fewer. A raw sample value of -32768 has a special meaning: it signifies that no valid observation of the associated signal was made during the corresponding sampling interval. This value is the most negative number that can be represented in 16 bits, so (in data with 16 or fewer bits of ADC resolution) it is less than any valid sample value.
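When processing raw samples yourself, the special value needs to be filtered out before scaling. A minimal Python sketch, assuming 16-bit (or narrower) data already read into a list of integers; the function name and the choice of None as the "missing" marker are illustrative, not something defined by PhysioBank:

```python
INVALID_SAMPLE = -32768  # most negative 16-bit value, reserved as "no valid observation"


def to_physical_or_none(samples, gain, baseline):
    """Scale raw samples to physical units, keeping invalid samples as None."""
    return [
        None if s == INVALID_SAMPLE else (s - baseline) / gain
        for s in samples
    ]
```

Treating -32768 as an ordinary sample would otherwise produce a large negative spike in the converted signal.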

What does the message “init: can’t open header for ...” mean?

This message can be produced by any application linked to the WFDB library, including rdsamp and rdann. In order to read data files, these applications need to find a header (.hea) file for the input record you specify. The message indicates that the header file was not found in any of the expected places, or that it was unreadable. There are three common reasons why this can happen:

  • The record name supplied to the application is not correct. Note that record names are not file names; if you wish to read, for example, a signal file named slp60.dat using rdsamp, you must specify the name of the record to which this file belongs (slp60) after the -r option, and not the name of the file itself. Whatever follows “init: can’t open header for ...” is what the application thinks is the name of the record you wish to read. Also, be aware that case matters in record names, even under operating systems that ignore case in file names. Thus “SLP60” is not a valid record name; “slp60” is.
  • The header file is missing. If you download signal (.dat) or annotation (.atr, .qrs, etc.) files, be sure to download the corresponding .hea files from the same locations.
  • The list of locations to be searched does not include the location of the header file. WFDB applications find their input files by searching a list of locations specified by the WFDB path (the environment variable WFDB, or a default list of locations if WFDB has not been set). The WFDB path normally includes the current directory, but this may not be true if the WFDB path has been modified; the current directory must appear explicitly (either as a “.” or as an empty component in the path) in order to be included in the list of locations to be searched. For further information, see “The Database Path and Other Environment Variables” in the WFDB Programmer’s Guide.

I can’t run rdsamp. Can you please send me a copy of ... in text format?

Yes. Go to the PhysioBank ATM, request the data of interest, and save the results in your browser.

How can I get more than 100,000 samples?

The PhysioBank ATM “Show samples as text” and “Export signals as CSV” tools are intended to offer short segments of data in text form without the need for anything more than a web browser. They are not intended to be methods for obtaining large amounts of data.

The ATM’s “Export signals as .mat” tool also limits the amount of data converted, but since .mat format is significantly more compact than text or CSV formats, the limit is 1,000,000 samples per signal. Larger amounts may be difficult to load into Matlab.

You may download an entire record in EDF or PhysioNet (binary) formats using the ATM’s “Export signals as EDF”, “Make tarball of record”, or “Make zip file of record” tools (or by simply downloading the files you need from the PhysioBank archive with your web browser).

If you need more data in text, CSV, or .mat format than allowed by the ATM’s tools, first convert a short segment in the desired format, then read the notes immediately below the ATM’s control panel to learn how to run the ATM’s format conversion applications (rdsamp and wfdb2mat) on your own computer, without limitations on the length of the output. If you install the WFDB Software Package and run rdsamp on your own computer, for example, you can convert entire records to text if you wish.

The PhysioBank ATM limits you to 100,000 samples in text or CSV formats at a time, because signal files converted to text can be very large, and reading them would not be possible with standard web browsers. If you do not wish to use any of the alternatives above, you may concatenate successive segments obtained by multiple requests to the ATM; this will allow you to obtain the data you wish without significantly affecting other PhysioNet users or crashing your web browser.

How can I create a PhysioBank-compatible record from my own data?

There are many ways to create a PhysioBank-compatible record. Here is an easy way to do so:

  1. If your signals are still in analog format, digitize them. For ECGs, we recommend using a sampling frequency of at least 120 Hz, with at least 8-bit resolution over a ±5 mV range (ideally, 250 Hz to 1 kHz, with 12-bit or higher resolution over a ±10 mV range). As is necessary whenever digitizing any signal, use an appropriate antialiasing filter (a low-pass filter with a cutoff no higher than about 40% of the sampling frequency).
  2. Write the samples into a file in text form, as a column of decimal numbers. If you have digitized more than one signal, use a separate column for each signal. (The software that was used to digitize your signals may include a means for doing this.)
  3. Read about wrsamp to see how to prepare a binary signal file and a header file from the text file. Typically, you will need to use a command such as

    This example reads a text file named data.txt, and creates the files needed for a record named Rec01, namely, a signal file named Rec01.dat and a header file named Rec01.hea. (See What are .dat, .hea, .atr, .qrs, ... files? for definitions of signal and header files.) The arguments of -F and -G specify that the signal was sampled at 128 Hz and that the signal was amplified in such a way that a step of 1 millivolt would appear as sample values that differ by 102.4 units. The final argument (0) indicates that the leftmost column in the input (column 0) contains the data.
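Step 2 and the meaning of the gain can be illustrated with a short Python sketch. It is not part of the WFDB software, and the function names are hypothetical. It assumes your measurements are available in millivolts and that you want integer sample values in a single column; a gain of 102.4 units per millivolt is what you would get, for instance, from a 10-bit converter spanning a 10 mV range (1024 / 10).

```python
def gain_from_adc(adc_bits, range_mv):
    """Sample units per millivolt for an ADC with 2**adc_bits levels over range_mv."""
    return (2 ** adc_bits) / range_mv


def to_text_column(values_mv, gain):
    """Format millivolt values as one column of integer sample values."""
    return "\n".join(str(round(v * gain)) for v in values_mv) + "\n"
```

Writing the string returned by to_text_column to a file named data.txt gives a one-column text file of the kind described in step 2; with several signals, each line would carry one value per signal.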

Records that belong to PhysioBank never have names that include upper-case characters, so you may wish to follow the example above and include at least one upper-case character in the names of any records you create, to avoid any possibility of confusing them with PhysioBank records.

There are shortcuts that may be useful if your data happen to be in a format for which a converter is available:

  • If you have EDF files, use edf2mit to convert them to PhysioBank-compatible format.
  • If you have AHA Database files (or others in AHA format), use a2m and ad2m to convert them to PhysioBank-compatible format.
  • If you have a .wav (audio format) file, it may be PhysioBank-compatible already, but you will need to make a PhysioBank-compatible header (.hea) file for it using wav2mit, so that WFDB applications can read the .wav file directly. (This works for most .wav files, but there are many infrequently-used variants of .wav format, and not all of them are compatible. Read on if wav2mit complains about your .wav file.)
  • If your signals are in another audio file format, you may be able to use audio file conversion software such as the freely available SoX converter to create a .wav file first, and then use wav2mit to finish the conversion.

If you wish to annotate your record, see How can I annotate a record?

Reading and Writing Annotations

What is an annotation?

Informally, an annotation is a note about some feature of a signal. On this web site, an annotation is a tag (label) that “points” to a specific sample of a digitized recording.

Most PhysioBank databases include annotations for each record. In some cases, these may be reference annotations that have been independently reviewed by one or more (usually, two) human experts; in others, they may be machine annotations generated by automated signal-processing and analysis software. The documentation for each database indicates what types of annotations are available.

Usually, annotations mark events that are localized in time (such as individual heart beats); sometimes, they are used to indicate persistent attributes (such as the beginning of a period of sleep). In recordings that contain two or more simultaneously recorded signals, an annotation can “point” to all signals at once, or to a specific signal.

Each annotation can be thought of as an object having six attributes: the time (the number of sample intervals that precede the sample that the annotation marks); an annotation type (anntyp [sic], usually displayed as a mnemonic annotation code; see the next question); three numeric attributes (subtyp [sic], chan, and num); and an optional string (the aux string). Only the time attribute has a fixed meaning; all of the others can be redefined to fit the characteristics of the data and the needs of the investigator.

Annotations are kept in files that exist independently of the signals that they annotate; this means, among other things, that multiple sets of annotations (created by different applications or people) can coexist, and that annotations can be read even if the signals to which they refer are not available.

Within an annotation file, annotations are stored in a compact binary format. See the questions and answers below for information about reading annotation files.

What do the annotation codes (N, V, S, F, ...) mean?

WFDB applications such as those used by the PhysioBank ATM display annotations using these and other codes. When these codes are used to annotate ECGs, N is a normal sinus beat, V is a ventricular ectopic beat, S is a supraventricular ectopic beat, and F is a fusion of a normal beat and a ventricular ectopic beat. These and many others are described here.

What is the format of the annotation files?

Most annotations occupy two bytes, of which 10 bits contain the time interval (in units of sample intervals) from the previous annotation, and 6 contain an annotation type code. Special type codes allow for annotations at intervals that exceed 1023 sample intervals, and for other numeric and text fields to be associated with individual annotations. See annot(5) in the WFDB Applications Guide for details.
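The bit packing can be illustrated with a short Python sketch. It rests on two assumptions that should be checked against annot(5) before relying on them: that each byte pair is stored least significant byte first, and that the type code occupies the six most significant bits. The function name is hypothetical, and the special type codes mentioned above (long intervals, numeric and text fields) are deliberately not handled.

```python
def decode_annotation_word(low_byte, high_byte):
    """Split one two-byte annotation into (type_code, interval).

    Assumptions: little-endian byte pair, type code in the top 6 bits,
    interval since the previous annotation in the low 10 bits.
    """
    word = low_byte | (high_byte << 8)
    type_code = word >> 10        # 6 bits: 0..63
    interval = word & 0x3FF       # 10 bits: 0..1023 sample intervals
    return type_code, interval
```

Ten bits can hold values up to 1023, which is why intervals longer than 1023 sample intervals need one of the special type codes. For everyday work, rdann or the WFDB library is a better choice than hand-written decoding.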

How can I read annotation files?

If you would like to read annotation files within a C, C++, Fortran, Java, Perl, or Python program, see the WFDB Programmer’s Guide. Other programming languages supported by SWIG may also be usable with the WFDB library, but have not been tested. Briefly, use annopen() to open the files, and getann() to read annotations from them.

If you would like to do this within a Matlab or Octave program, we recommend using the WFDB Toolbox for MATLAB. For an overview of this solution and a variety of alternatives of varying degrees of complexity, see Reading and writing PhysioBank and compatible data on the Contributed software for Matlab and Octave page. Note that Matlab and Octave are not able to import annotation files directly.

Another possibility is to convert the portions of interest into text format using rdann (described in detail in How to obtain PhysioBank data in text form). rdann can be downloaded as part of the WFDB Software Package and run on your own computer. To save rdann’s output in a file, or to read rdann’s output using another program, see this note. Alternatively, annotation files found on this web server can be converted into text using the PhysioBank ATM, which can be accessed using your web browser. The format of rdann’s text output is described below.

What does the error “annopen: can’t read annotator ... for record ...” mean?

This message can be produced by any application linked to the WFDB library that attempts to read annotation files. In order to do so successfully, these applications need to find the annotation file for the annotator and input record you specify. The message indicates that the annotation file was not found in any of the expected places, or that it was unreadable. There are several common reasons why this can happen:

  • The record name supplied to the application is not correct. Note that record names are not file names; if you wish to read, for example, an annotation file named 100.atr using rdann, you must specify the name of the record to which this file belongs (100) after the -r option, and not the name of the annotation file itself. Whatever follows “for record ...” in the error message is what the application thinks is the name of the record you wish to read.
  • The annotator name supplied to the application is not correct. Note that annotator names are not file names, either; if you wish to read, for example, an annotation file named 100.atr using rdann, you must specify the annotator name of the file (its suffix, atr) after the -a option, and not the name of the annotation file itself. Whatever follows “annotator ...” in the error message is what the application thinks is the annotator name of the file you wish to read.
  • The annotation file may not be in the WFDB path. Check this using wfdbwhich, as in

    If wfdbwhich cannot find the annotation file, copy or move it into any of the locations in the WFDB path (listed by wfdbwhich), or add the directory containing the annotation file to the WFDB path. For further information, see “The Database Path and Other Environment Variables” in the WFDB Programmer’s Guide.

Where do I get rdann?

It’s part of the free, open source WFDB Software Package. Both C sources and binaries for several popular operating systems are available.

How do I use rdann?

Install it, then type

for a brief summary of options. For details, see rdann(1) in the WFDB Applications Guide.

The format of rdann’s output is described in the answer to the next question.

To save rdann’s output in a file, or to read rdann’s output using another program, see this note.

What are the columns in rdann’s output?

If you add the -v option at the end of the command line, rdann prints a set of column headings above the first annotation line.

The output contains one annotation per line; from left to right, each line contains the time of the annotation in hours, minutes, seconds, and milliseconds; the time of the annotation in sample intervals; a mnemonic for the annotation type; the annotation subtyp [sic], chan, and num fields; and the auxiliary information string, if any. The meanings of the annotation type mnemonics and of the other fields are discussed here.

For example, if we read the first five seconds of the reference (atr) annotations for record 200 of the MIT-BIH Arrhythmia Database using the command

then we obtain this output:

Each of these eight lines contains one annotation. The third column shows the annotation mnemonics, and by referring to the table of mnemonics, we can see that the ‘+’ in the first annotation indicates that it marks the underlying rhythm of the beats that follow; the rhythm type is ventricular bigeminy, specified by the “(B” that appears in the aux field at the end of the line; see this table for descriptions of rhythm annotation strings such as “(B”. The remaining seven lines each mark a QRS complex, associated with either a normal (N) or premature ventricular (V) beat. Times in the first and second columns indicate when the events marked by the annotations occur. For example, the first V beat occurs 0.625 seconds (625 milliseconds), or 225 sample intervals, after the beginning of the record. (A quick calculation shows that one sample interval is 2.777... milliseconds for this record, or that its sampling frequency is 360 Hz. Sample intervals may vary between records.) The subtyp, chan, and num fields in columns 4, 5, and 6 are usually zero in reference annotation files, but occasionally one or more of these fields is used to indicate additional information, as in this case, in which the subtyp field in the V annotations indicates which of several ventricular ectopic beat morphologies has occurred. See the documentation for the associated database to see how to interpret these fields.
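The “quick calculation” relating the first two columns can be written out in Python. The function names are illustrative; the numbers (225 sample intervals, 360 Hz) come from the example above, and the formatted string is only a minutes:seconds.milliseconds rendering for elapsed times, not a guaranteed match for rdann's exact layout.

```python
def samples_to_seconds(n_samples, sampling_frequency):
    """Elapsed time in seconds for a time given in sample intervals."""
    return n_samples / sampling_frequency


def format_elapsed(n_samples, sampling_frequency):
    """Render an elapsed time as minutes:seconds.milliseconds."""
    total_ms = round(n_samples * 1000 / sampling_frequency)
    minutes, rest_ms = divmod(total_ms, 60_000)
    seconds, ms = divmod(rest_ms, 1000)
    return f"{minutes}:{seconds:02d}.{ms:03d}"
```

Here samples_to_seconds(225, 360) gives 0.625, and one sample interval lasts 1000 / 360 ≈ 2.778 milliseconds. The sampling frequency must be read from the header of the record in question, since it differs between records.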

In some cases, the times in the first column may be enclosed in square brackets [like this]. This format indicates that the times are given as times of day (in the local time zone where the recording was made). Bracketed times may also include the date (in DD/MM/YYYY format), if this information is available. If the time of the beginning of the recording is not available, the times in the first column are not bracketed, and in this case they represent the elapsed time from the beginning of the recording.

I can’t run rdann. Can you please send me a copy of ... in text format?

Yes. Go to the PhysioBank ATM, request the data of interest, and save the results in your browser.

How can I create an annotation file?
How can I annotate a record?

If the signals you wish to annotate are not already in a PhysioBank-compatible format, including .dat and .hea files, follow the instructions in How can I create a PhysioBank-compatible record from my own data?

If your record contains ECG or blood pressure signals, you may wish to make a beat annotation file. There are several ways to create one using PhysioToolkit software:

  • Use sqrs, a good, fast and simple QRS detector.
  • Use wqrs, a reasonably fast QRS detector that generally works better than sqrs.
  • Use ecgpuwave, a very good QRS detector that also locates the P- and T-waves and their boundaries. Follow the link above to obtain ecgpuwave.
  • Another PhysioToolkit application, wabp, can create a beat annotation file from a blood pressure signal.

You should always review the beat annotation file generated by any of these detectors; although all of them work well in most cases, there is wide variability among recordings, and any detector will make errors if the data quality is insufficient. There are, once again, several ways to do this:

  • Use WAVE, an interactive viewer that permits you to correct QRS detection errors manually if you wish to do so; you can also create annotation files from scratch (completely manually) using WAVE. WAVE is included in the WFDB Software Package and runs under FreeBSD, GNU/Linux, Mac OS X, MS-Windows, and Solaris.
  • Use pschart or psfd. Both produce PostScript output. If you don’t have a PostScript printer, you can view or print the output using GhostScript (a free PostScript-compatible rasterizer; follow the link to obtain GhostScript from its developers).

All four of these detectors mark all detected beats as normal (N). If your record includes abnormal beats, change the N annotations for these beats to the correct annotations (a complete list of annotation types can be found here). This can be done manually using WAVE.

Another possibility is to use OSAS, free software for QRS detection and beat classification available from its author (follow the link for details). This may be particularly helpful if your records contain more than a handful of abnormal beats, since OSAS can find most abnormal beats and annotate them appropriately, but it is still necessary to review the automatically-generated annotation file and correct any errors.

If you have annotations (or their equivalent) that must be converted into PhysioBank-compatible annotation file format, it may be easiest to convert them first into a text format that can be read by rr2ann, which can then be used to produce the desired (binary) annotation file. If you wish to use any of the optional annotation attributes (subtyp, chan, num, or aux), rr2ann will not be sufficient. In this case, you may wish to convert your data first into rdann’s output (text) format; this can be read as input by wrann, which will convert the data into PhysioBank-compatible annotation format. If you do this, note that the first column (time in hours, minutes, and seconds) must be present but need not be valid, since wrann determines the annotation times from the second column (time in sample intervals); note also that entries in the last column may be omitted for any annotations that have an empty aux field. Both rr2ann and wrann read text-format data from their standard input.

Software

I double-clicked on the program icon, and nothing happens!
I typed the program name in the ‘Run...’ dialog, and nothing happens!

Don’t do this!

With few exceptions, PhysioToolkit applications run in text mode (i.e., they do not include a graphical user interface). These programs are intended to be run within a terminal emulator using a command-line interface. In most cases, if you attempt to run them by clicking on their icons or names, or by entering the program name in the MS-Windows Run... dialog box, these programs will open a DOS box, print a usage summary, and exit, usually much too fast for you to read anything.

By far the best way to use these programs under MS-Windows is to install a Unix-compatible terminal emulator and shell in which to run them. The best of these is also free; if you have not already done so, download and install the Cygwin software package. This package includes bash (the GNU Bourne Again Shell), and a terminal emulator in which to run it. After a standard installation of Cygwin, you can launch a terminal emulator and bash by clicking on the Cygwin icon that will have been installed on your desktop.

If you do not wish to use Cygwin, it is possible to run text-mode applications under MS-Windows within a DOS box, but there are many limitations of command.com that may prove frustrating. In particular, command.com supports a relatively small space for environment variables that is not secure against buffer overruns, and has idiosyncratic filename globbing behavior.

What is a “standard input” or a “standard output”?

These concepts are common to all text mode applications (see the previous question). A program’s standard input is whatever it reads from the keyboard (i.e., whatever you type into its terminal emulator window once the program begins to run). A program’s standard output is whatever it prints in its terminal emulator window. There are (of course) exceptions, and the exceptions are what make these ideas useful!

First, it’s possible to redirect either or both of the standard input and the standard output before the program begins to run, by adding appropriate parameters to the command line. So, for example, a program named pour can read its standard input from a file named teapot, and then write its standard output to another file named teacup, using a command such as:

    pour <teapot >teacup

(For an explanation of this command, see the answer to the next question.)

Second, most applications have an additional standard error output that is also printed in the terminal window, intermingled with the standard output. The standard error output is reserved for warning and error messages. If you redirect the standard output to a file, the standard error output still appears in the terminal window (and is not copied into the file). In most cases, this is useful behavior, since it allows you to see quickly if there have been any errors or warnings without the need to look through what may be lengthy output. If you wish, you can capture the standard error output in its own file using a command such as:

How can I save the output of ... in a file?
How can one program read another’s output?

If you are running programs from a command prompt (by typing commands into a terminal emulator window or an MS-DOS box), these things can be done easily.

If you have ever used GNU/Linux, Unix, or MS-DOS, you may have captured the output of a program by redirecting it to a file, like this:

    foo >bar

The > operator redirects foo’s standard output (which would normally appear on-screen) into a file named bar. If bar exists already, its contents are replaced. If you wish to append foo’s output to whatever is already contained in bar, use a command such as this instead:

    foo >>bar

There is an analogous operator that arranges for a program’s standard input (which would normally be read from whatever you type on the keyboard) to be read from a file instead:

    baz <bar

Here, the < operator arranges for baz to read its input from a file named bar. If bar was created by foo, then this command allows baz to read foo’s output.

You can combine input and output redirection in a single command using the pipe (|) operator:

    foo | baz

This command runs foo and sends its standard output directly to baz, without requiring an intermediate file. True multitasking operating systems such as Unix, GNU/Linux, and Mac OS X allow both programs to run (apparently) simultaneously; under MS-DOS or MS-Windows, the first program runs to completion before the second one begins execution.

You can use these techniques whenever you run programs from a command prompt, whether those programs are among those available here or obtained from some other source. You can use the same techniques with programs you write yourself; the only requirement is that your programs must read from the standard input and write to the standard output (i.e., they must not attempt to bypass the standard input/output mechanism by reading directly from the keyboard or writing directly to the screen).
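As an illustration of a program that meets this requirement, here is a minimal Python sketch of a filter. It is hypothetical and not part of PhysioToolkit: it assumes its input consists of lines holding a sample number followed by raw integer sample values (similar in shape to rdsamp's default output), divides the sample values by a gain supplied by the caller, and reports unparseable lines on the standard error output rather than mixing them into the results.

```python
import sys


def convert(infile, outfile, errfile, gain):
    """Copy lines from infile to outfile, scaling all but the first column.

    Lines that cannot be parsed are reported on errfile and skipped.
    Returns the number of skipped lines.
    """
    skipped = 0
    for lineno, line in enumerate(infile, start=1):
        fields = line.split()
        try:
            index = int(fields[0])
            values = [int(f) / gain for f in fields[1:]]
        except (IndexError, ValueError):
            print(f"line {lineno}: skipped", file=errfile)
            skipped += 1
            continue
        print(index, *(f"{v:.3f}" for v in values), sep="\t", file=outfile)
    return skipped
```

A script built around this function would end with a call such as convert(sys.stdin, sys.stdout, sys.stderr, gain=200.0), where 200.0 is only an example value. Because the function never touches the keyboard or screen directly, such a script can sit on either side of the <, >, and | operators described above.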

These operators (>, >>, <, and |) are supported by all shells (command interpreters) under Unix, GNU/Linux, Mac OS X, and MS-DOS (including those that run within MS-DOS boxes or other types of terminal emulators under MS-Windows). For further information, please refer to the documentation for your shell or command interpreter.

My question is about WAVE (or gtkwave). Is there a WAVE FAQ?

Yes, look here for answers to many frequently asked questions about WAVE. The gtkwave project is no longer active, since WAVE now runs on all of the popular platforms, including those formerly supported by gtkwave only.

I tried to compile ... but the compiler can’t find wfdb.h (or ecgcodes.h, or ecgmap.h).

These files are included with the WFDB library. Most of the PhysioToolkit applications use at least one of them; if you are trying to compile such an application, you will need to have installed the WFDB library and its *.h files first. The easiest way to do this is to install the WFDB Software Package, which includes the WFDB library and many of the PhysioToolkit applications. Find instructions for doing so in the quick start guide for your platform (FreeBSD, GNU/Linux, Mac OS X (Darwin), MS-Windows, and Solaris), or on the WFDB Software Package introductory page.

If you have already installed the WFDB Software Package and your compiler is still complaining, the WFDB *.h files may not be installed in any of the directories where your compiler is looking for them. Use wfdb-config to find out where they are.

I tried to compile ... but the compiler complains that isigopen (or iannopen, or strtim, or wfdbinit) is undefined.

These are among the functions defined in the WFDB library; most of the PhysioToolkit applications use at least one of these functions. If you are trying to compile such an application, it must be linked to the WFDB library. If you have not yet installed the WFDB library, see the answer to the previous question.

For details on how to link to the WFDB library, see Compiling a Program with the WFDB Library in the WFDB Programmer’s Guide.

I’m writing a program to work with PhysioBank data, but my compiler can’t link to the WFDB library. What should I do?

If you are using one of the precompiled versions of the library, be sure that you have the correct version for use with your compiler and operating system. If there is none available, you have two reasonable choices:

  • Get the WFDB library sources and, if desired, the libcurl or libwww sources, and compile them yourself. Please contribute the binary to PhysioToolkit once you have tested it and are sure it works properly.
  • Use a compiler for which a precompiled version of the library is available, such as the free and open-source GNU C compiler for GNU/Linux, Mac OS X, MS-DOS, MS-Windows, and Solaris.

Where on this site can I find software for my favorite operating system/compiler?

Look in PhysioToolkit, the repository for all software available on this site. With very few exceptions, the software available here is portable among all popular operating systems, including GNU/Linux, Mac OS X, MS-Windows, and Unix. Since all of it is provided in source form, you can compile it (using free or proprietary compilers) into binaries that can run under any of these operating systems.

For convenience, some PhysioToolkit software is also available as ready-to-run binaries for a variety of operating systems.

Generally, the same sources can be compiled without modification under any supported OS or compiler; you will not find separate sets of sources for different compilers or platforms. Following conventions used by most free or open-source software, look for files named README or INSTALL in each software package; these files indicate what’s included in the package, and how to compile it from the sources.

Most PhysioToolkit software is written in portable (ANSI/ISO standard) C. ANSI/ISO C code can be compiled by all standard C++ compilers. There is a small amount in other languages, including Fortran 77 and Matlab/Octave m-code. If you don’t have a C or C++ compiler, we strongly recommend the excellent and free GNU Compiler Collection (gcc), which includes C, C++, and Fortran 77 compilers (among others), and is available for a vast range of platforms, including GNU/Linux, Mac OS X, MS-Windows, and all versions of Unix.

If you wish to write your own software to work with PhysioBank data, the WFDB library provides standard, portable interfaces in C, C++, and Fortran for doing so. The wfdb-swig package provides Perl, Python, Java, and C# interfaces to the WFDB library. Matlab can use any of several compatible APIs.

Although it is possible to compile PhysioToolkit software using proprietary compilers, you are generally on your own if you choose to do so; we don’t use these compilers ourselves, and we can’t help you learn how to use them.

Can I use your code in my commercial application?

Yes. There are two different categories of PhysioToolkit code, and the rules for using them are slightly different.

The WFDB library is free under the GNU Lesser General Public License (LGPL). The LGPL permits you to use (or sell, or give away) the library with your own code. The only significant restriction is that you must make the sources for the library itself freely available. You do not need to disclose the sources for your own code simply because you have used the WFDB library with it.

All of the remaining PhysioToolkit software (the applications) is free under the GNU General Public License (GPL). What this means in simple terms is that you can sell it or give it away to others, but if you do so, you must distribute the sources under the same terms as those under which you received them.

If you incorporate GPL code into your own code, the resulting code must be distributed under the GPL or not at all; this is the so-called “viral” property of the GPL. What this implies is that you cannot simply make minor (or even major) modifications to free code and then sell it without honoring the original terms under which you received it.

There are ways to use GPL code together with proprietary code, however. For example, software that reads output from a GPL program (or that writes data to be read by a GPL program) does not automatically fall under the GPL. As another example, you may incorporate GPL code in a plugin for a proprietary program, but the sources for the plugin itself would have to be made available under the GPL.

Contributors of software may choose another license conforming to the Open Source Definition (OSD), so that, in the future, other licenses may apply. Other OSD licenses have provisions very similar to those outlined above.

How should I report a bug?

First, be sure that it is a bug. Try to reproduce it. Try doing so on another computer if possible.

If you have not read How to Report Bugs Effectively, please take a few minutes to do so. (Important: do not send bug reports about PhysioNet to the author of How to Report Bugs Effectively; he is an innocent bystander.)

Bug reports should provide enough specific information to permit duplicating your problem. At a minimum, this information includes:

  1. the name and version number of the software in which you found the bug, and the location on PhysioNet where the software can be found
  2. the name and version number of your operating system (e.g., Mac OS X 10.4, Fedora Core 4, Windows XP Professional)
  3. the exact command or sequence of events needed to replicate the problem
  4. an exact copy of any text output, including any errors or warning messages encountered
  5. the symptoms of the bug (how the output varied from what you expected, e.g., “v0 is smaller than it should be by a factor of 400”)

Do not send binary input or output files or core dumps unless requested. If you can reproduce the problem using input data available on PhysioNet, please tell us how to do so.
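
As a rough illustration of the checklist above, the following Python sketch assembles the five minimum items into a plain-text report. The function name format_bug_report, its parameters, and any values passed to it are hypothetical and are not part of PhysioToolkit; when no operating system description is supplied, one is taken from Python's standard platform module.

```python
import platform


def format_bug_report(software, version, location, command, output, symptoms,
                      os_description=None):
    """Assemble the five minimum items of a bug report as plain text.

    All parameter names are illustrative assumptions, not a PhysioNet format.
    """
    if os_description is None:
        # platform.platform() returns one string naming the OS and its version.
        os_description = platform.platform()
    lines = [
        f"1. Software: {software} {version} ({location})",
        f"2. Operating system: {os_description}",
        f"3. Command to reproduce: {command}",
        "4. Exact output:",
        output.rstrip("\n"),  # paste the text output verbatim
        f"5. Symptoms: {symptoms}",
    ]
    return "\n".join(lines)
```

The resulting text can be pasted into the body of an email; it deliberately contains only text, in keeping with the request not to send binary files or core dumps.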


Carefully written bug reports are very valuable to us; we want our software to work reliably, we are grateful for information that helps us to fix defects, and we acknowledge the help of those who send us useful bug reports. If you wish to remain anonymous, please let us know when you write.

If you are able, by inspection of the sources, to locate the cause of a problem, tell us what you discover. If you can fix the problem yourself, send us a patch against the latest sources. These things, though very much appreciated, are not essential components of a useful bug report, however; what is essential is an accurate description of the symptoms of the bug. In some cases, what we think of as a feature may be what you think of as a bug; please help us understand what looks wrong to you. Without this context, a patch may be of little use to us.

All software on this site is provided in source form. If the documentation for the software in which you have found a bug does not provide an email address for bug reports, find it at the top of the source file. Please send all bug reports to both the author/maintainer and PhysioNet.

Help!

Some links don’t work, but I don’t see any error. Why not?

On PhysioNet, links to external sites (URLs that point outside of the PhysioNet domain) are designed to open the external URL in a separate window or tab. In most cases, this window or tab will open on top (in front) of the window that contains the link, but your browser and your window manager or operating system may override this behavior, especially if the second window was already open and was hidden. If clicking on a link doesn’t seem to do anything, check to see if there is a second browser window that is hidden behind other windows, or iconified (minimized, closed).

If you are sure that a link is broken, please send a note about it to webmaster@physionet.org.

I’m having trouble viewing images on this site. Why?

Most of the graphics on PhysioNet, including all of the dynamically generated graphics, are PNG images. PNG has been a W3C recommendation since 1996, and is one of only three standard image types that are rendered by all current graphical web browsers, most of which have supported PNG for ten years or more. The other supported types are JPEG (which uses lossy compression and is best suited for continuous-tone graphics such as photos) and GIF (which uses a lossless compression algorithm that is inferior to PNG’s). If you have an obsolete browser, upgrading it should fix this problem.

The QuickTime plugin sometimes interferes with some browsers’ built-in capability of rendering PNG images, however, notably when using MSIE. To avoid these problems, update or uninstall QuickTime, or use another browser such as Chrome or Firefox.

I’m having trouble printing PostScript files from this site. Why?

PostScript versions of books and papers available here are ready to be printed on a PostScript printer without any additional formatting. Some users have experienced problems, particularly with older PostScript versions of the WFDB Applications Guide (which consists of several PostScript documents concatenated together into one file). Software that attempts to insert additional PostScript code, or that attempts to reformat these files rather than simply printing them as is, is generally the cause of these difficulties. MS-Windows users can use GSView to view or print these files. Under UNIX, GNU/Linux, or Mac OS X, simply print the files using lp or lpr, or view them using gv. (Both GSView and gv require GhostScript to render PostScript or PDF input.) If your printer has insufficient memory, it may stop after printing part of the file; in this case, try using GSView or gv to print the file in sections.

If a PDF version of the file is available here, you may also wish to try printing it using GSView, gv, or xpdf (all three of these are free and open-source), or Adobe Acrobat Reader (free binaries, closed-source).

I don’t understand how to use the software or data on this site.

Go back to the beginning of this FAQ and read it carefully. Still confused? Read on....

Are you looking for something specific? Examples might include:

  • a set of data or software from a published study
  • an answer to a question
  • data of a specific type
  • software to solve a specific problem

If so, try using the Search tool. All text on the PhysioNet web site is indexed and can be found by searching for it. To do this, type one or more terms related to your topic or question into the search box in the top right corner of this and almost every other page on PhysioNet, then click on the “Search” button to its right.

Have you found something relevant to your interest, but don’t know how to use it? If so, look for tutorial materials that can help you get started. Browse through PhysioNet’s list of tutorials, or use a PhysioNet search to find information to help you get started.

If you have a question about a specific page in the PhysioNet website, click on the “webmaster@physionet.org” link at the bottom of that page; doing this opens a preaddressed email window, with the URL of the page filled in as the subject, which will help us to understand the context of your question and to give you a relevant answer.

Before writing, please formulate specific questions (“I don’t understand how to use the data.” or “The software doesn’t work, please help me!” are examples of non-specific questions that cannot be usefully answered). Whoever replies to your question cannot read your mind; if you don’t say clearly what you need to know, you will not get a satisfactory answer.

Don’t be offended if the reply to your question is “Read the FAQ!” (this page). If the answers aren’t here, or if they aren’t clear, write again, and try to be more specific, or to point out what’s confusing or missing in the FAQ. Doing so will not only help you to get a useful answer, but it will also help us to write a better FAQ.

What’s a man page?

A man page is a concise description of how to use a piece of software, intended to be read using man or one of its work-alikes. Think of man pages as pages from a reference manual.

All Unix platforms, including GNU/Linux and Mac OS X, as well as Cygwin/MS-Windows, include a program called man that can be used to find and display man pages. This is the standard form of documentation for all Unix software. Almost all PhysioToolkit applications have man pages. The near-universality of man pages means that you are very likely to be able to learn about any program by typing its name as an argument to a man command, as in:

    man tar

which will display the man page that describes the standard tar command. On most platforms, the output of man is sent through a program such as more, which allows you to read it one screenful at a time; you may usually advance to the next screenful by pressing the space bar, go back a screenful by typing ‘b’, or exit by typing ‘q’.
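
If you would rather fetch a man page from a script than read it interactively, the lookup can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names man_command and read_man_page are hypothetical, the script assumes a man program is installed and on your PATH, and the optional section number follows the common "man SECTION NAME" calling convention.

```python
import shutil
import subprocess


def man_command(program, section=None):
    """Build the argument list for displaying a program's man page."""
    args = ["man"]
    if section is not None:
        # A manual section, if given, conventionally precedes the name.
        args.append(str(section))
    args.append(program)
    return args


def read_man_page(program, section=None):
    """Return the man page text, or None if man is missing or finds no entry."""
    if shutil.which("man") is None:
        return None
    result = subprocess.run(man_command(program, section),
                            capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else None
```

Calling read_man_page("tar") runs the equivalent of typing the program's name as an argument to man; because the output is captured rather than sent to a terminal, no pager is involved.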

The format of man pages is fairly rigid, which allows a variety of software to extract useful information from them for purposes of indexing, cross-referencing, etc. They are not intended as tutorial material, but once you are familiar with their format, reading them is usually the quickest way to learn how to use the software they document.

The largest collection of man pages on PhysioNet is the WFDB Applications Guide, which includes not only the man pages for the roughly 70 applications in the WFDB Software Package, but also those for a number of contributed applications that are compatible with WFDB Software. These pages can be read within your web browser; if you download and install the WFDB Software Package on your own computer, you can use man to read the local copies of these man pages that will be installed together with the software itself.

A distinctive feature of PhysioNet’s man pages is the Sources section at the end of each one, with one or more URLs that give the location of the software sources. This feature is particularly handy if you are reading a man page in your web browser and would like to refer to the source of a program in order to see how it is implemented.

The Computer Science Department of McGill University offers a gentle introduction to reading man pages that will help you get started if you haven’t used man pages previously.

Although you won’t find the acronym “RTFM” used elsewhere on this site, it refers to the usefulness of Reading The Fine Manual (i.e., the man pages) to inform yourself about software that may be unfamiliar. Try it!

Are PhysioNet or its mirror maintainers responsible for the content of external sites?

No.

Why isn’t my question here?

This FAQ is revised frequently, and we may not have got to your question yet. It’s possible that yours is an infrequently, maybe even never-before, asked question. No matter, we’d like to hear it, and we’ll try to answer it as quickly as possible. Please send us feedback by following the link below!

What’s the magic word?

“Please.”

If you would like help understanding, using, or downloading content, please see our Frequently Asked Questions.

If you have any comments, feedback, or particular questions regarding this page, please send them to the webmaster.

Comments and issues can also be raised on PhysioNet's GitHub page.

Updated Monday, 7 August 2017 at 09:55 EDT

PhysioNet is supported by the National Institute of General Medical Sciences (NIGMS) and the National Institute of Biomedical Imaging and Bioengineering (NIBIB) under NIH grant number 2R01GM104987-09.
