Some Basic Tips
- R code can be entered into the command line directly or saved to a script, which can be run inside a session using the source function
- Commands are separated either by a ; or by a newline.
- R is case sensitive.
- The # character at the beginning of a line signifies a comment, which is not executed.
- R stores both data and output from data analysis (as well as everything else) in objects
- Things are assigned to and stored in objects using the <- or = operator
- A list of all objects in the current session can be obtained with ls()
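
The tips above can be tried out in a short session. The following is a minimal sketch, not a prescribed workflow: the script is written to a temporary file only so the example is self-contained, and the names script_path, a, b, total and Total are hypothetical.

```r
# Write a tiny two-line script to a temporary file (hypothetical location).
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "a <- 1; b <- 2   # two commands on one line, separated by ;",
  "total = a + b    # = assigns just like <-"
), script_path)

# Run the saved script inside the current session.
source(script_path)
total   # 3

# R is case sensitive: Total is a different object from total.
Total <- 100

# List the objects currently stored in the session.
ls()
```

After source() runs, the objects created by the script (a, b and total) are available in the session alongside anything assigned at the command line.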
Dataset Files
R works most easily with datasets stored as text files. Typically, values in text files are separated, or delimited, by tabs or spaces:
gender id race ses schtyp prgtype read write math science socst
0 70 4 1 1 general 57 52 41 47 57
1 121 4 2 1 vocati 68 59 53 63 31
0 86 4 3 1 general 44 33 54 58 31
0 141 4 3 1 vocati 63 44 47 53 56
or by commas (CSV file):
gender,id,race,ses,schtyp,prgtype,read,write,math,science,socst
0,70,4,1,1,general,57,52,41,47,57
1,121,4,2,1,vocati,68,59,53,63,61
0,86,4,3,1,general,44,33,54,58,31
0,141,4,3,1,vocati,63,44,47,53,56
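
As an illustration of how such files might be loaded, the sketch below writes the first two data rows of each example to temporary files and reads them back with read.table() and read.csv() from base R. The temporary paths and the object names dat_txt and dat_csv are assumptions made for this example; with a real dataset, the path to the file would be passed instead.

```r
# Temporary files stand in for real dataset files (an assumption of this sketch).
txt_path <- tempfile(fileext = ".txt")
csv_path <- tempfile(fileext = ".csv")

# Space-delimited version: header row plus the first two data rows.
writeLines(c(
  "gender id race ses schtyp prgtype read write math science socst",
  "0 70 4 1 1 general 57 52 41 47 57",
  "1 121 4 2 1 vocati 68 59 53 63 31"
), txt_path)

# Comma-delimited (CSV) version of the same layout.
writeLines(c(
  "gender,id,race,ses,schtyp,prgtype,read,write,math,science,socst",
  "0,70,4,1,1,general,57,52,41,47,57",
  "1,121,4,2,1,vocati,68,59,53,63,61"
), csv_path)

# read.table() splits on any whitespace (spaces or tabs) by default;
# header = TRUE tells it that the first line holds the variable names.
dat_txt <- read.table(txt_path, header = TRUE)

# read.csv() expects commas and a header row by default.
dat_csv <- read.csv(csv_path)

names(dat_txt)
names(dat_csv)
```

Both calls return a data frame with one column per variable named in the header row.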
Using dim(dataset_name), we get the dimensions of the dataset, i.e., the number of observations (rows) and variables (columns).
Using str(dataset_name), we get the structure of the dataset, including the class (type) of each variable.
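
A short sketch of both functions follows. The data frame dat is a hypothetical stand-in built from three of the columns in the example rows above; any data frame could be used in its place.

```r
# A small data frame built from the id, prgtype and read values shown earlier.
dat <- data.frame(
  id      = c(70, 121, 86, 141),
  prgtype = c("general", "vocati", "general", "vocati"),
  read    = c(57, 68, 44, 63)
)

# Number of observations (rows), then number of variables (columns).
dim(dat)   # 4 3

# Compact summary: one line per variable with its type and first values.
str(dat)
```

The first element returned by dim() is the row count and the second is the column count, so dim(dat)[1] gives the number of observations on its own.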