

R and Linear Algebra

by Joseph Rickert

I was recently looking through upcoming Coursera offerings and came across the course Coding the Matrix: Linear Algebra through Computer Science Applications, taught by Philip Klein from Brown University. This looks like a fine course, but why use Python to teach linear algebra? I suppose this is a blind spot of mine. MATLAB I can see: that software has a long tradition of being used in applied mathematics and engineering applications. The Linear Algebra course from MIT OpenCourseWare is based on MATLAB, and half the linear algebra books published by SIAM use MATLAB. MATLAB I would expect; Python seems like a stretch.

But where do we stand in the R world vis-a-vis linear algebra? Well, from a pedagogical point of view, there does not appear to be much material “out there” that specifically relates to teaching linear algebra with R. It seems that not much has changed since this 2009 post. A Google search yields some nice, short documents (Hyde, Højsgaard, Petris, and Carstensen) that look like the online residue from efforts to teach linear algebra using R. And, although most introductory R books have some material devoted to linear algebra (e.g. the extended Markov chain example in The Art of R Programming), one would be hard pressed to find a book entirely devoted to teaching linear algebra with R. Hands-On Matrix Algebra Using R: Active and Motivated Learning with Applications by Hrishikesh D. Vinod is the exception. (This looks like a gem waiting to be discovered.)

My guess, however, is that we will see much more introductory R material focused on linear algebra as scientists and engineers with computational needs outside of statistics proper discover R. R is, after all, well suited for doing the matrix computations associated with linear algebra, and here are some reasons:

(1) As a language designed for doing computational statistics, R is built on an efficient foundation of well-tested and trusted linear algebra code.
From the very beginning, R was good at linear algebra. The vignette 2nd Introduction to the Matrix package lays out a little of the history of R’s linear algebra underpinnings:

“Initially the numerical linear algebra functions in R called underlying Fortran routines from the Linpack (Dongarra et al., 1979) and Eispack (Smith et al., 1976) libraries but over the years most of these functions have been switched to use routines from the Lapack (Anderson et al., 1999) library which is the state-of-the-art implementation of numerical dense linear algebra.”

(For example, the base R functions for computing eigenvalues, eigen(), Cholesky decompositions, chol(), and singular value decompositions, svd(), all use LAPACK or LINPACK code.)

(2) R’s notation, indexing and operators are very close to the matrix notation that mathematicians normally use to express linear algebra, and there are basic R functions for several matrix operations. (See Quick R for a summary.)

(3) The way R functions operate on whole objects (vectors, matrices, arrays, etc.) models very closely the conceptual process of manipulating matrices as single entities.

(4) The seamless interplay between the data frame and matrix data structures in R makes it easy to populate matrices from the appropriate columns in heterogeneous data sets.

(5) R is an extensible language, and there is a considerable amount of work being done in the R community to "go deep", to "go sparse" and to "go big".

Going Deep

“Going Deep” means making it easy to access the computational resources that may be necessary for building production-level applications. To this end, the Rcpp package makes it easy to write R functions that call C++ code to do the heavy lifting. Moreover, the RcppArmadillo and RcppEigen packages provide direct and efficient access to the C++ Armadillo and Eigen libraries for doing linear algebra.

Going Sparse

Modern statistical applications on even moderately large data sets often produce sparse matrices.
One way to work with them in R is via Matrix, a “recommended” package that provides S4 classes for both dense and sparse matrices that extend R’s basic matrix data type. Methods for R functions that work on Matrix objects provide access to efficient linear algebra libraries, including BLAS, LAPACK, CHOLMOD (including AMD and COLAMD) and CSparse. (Getting oriented in the world of linear algebra software is not an easy task; I found this chart from an authoritative source helpful.)

The following code, which provides a very first look at the Matrix package, shows a couple of notable features: (1) the Matrix() function evaluates a matrix to determine its class, and (2) once the Cholesky factorization is computed, it automatically becomes part of the matrix object.

    # A very first look at
    library(Matrix)
    set.seed(1)
    m <- 10; n >
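As a rough illustration of the two Matrix features described above, here is a minimal sketch. It is not the original listing: the matrix sizes, the variable names (A, A_M, Z_M, B_M, R) and the random data are assumptions made for the example. Whether the factorization appears in the object's factors slot after chol() may depend on the installed Matrix version, so the sketch only prints what it finds.

```r
library(Matrix)

set.seed(1)

# Hypothetical sizes and data, chosen only for illustration.
m <- 10
n <- 5
A <- matrix(runif(m * n), nrow = m, ncol = n)

# (1) Matrix() inspects its input and chooses a class for it.
A_M <- Matrix(A)                 # dense input with no special structure
print(class(A_M))

Z <- matrix(0, nrow = 6, ncol = 6)
Z[1, 4] <- 2
Z[3, 2] <- -1
Z[5, 6] <- 7
Z_M <- Matrix(Z)                 # mostly zeros, so a sparse class is chosen
print(class(Z_M))

# (2) Compute a Cholesky factorization of the cross-product t(A) %*% A.
B_M <- crossprod(A_M)
print(class(B_M))
R <- chol(B_M)                   # upper-triangular factor, t(R) %*% R equals B_M

# Look at the factors slot of B_M to see whether the factorization was stored.
print(names(B_M@factors))
```

Calling str(B_M) before and after chol(B_M) is another way to see what, if anything, has been added to the object.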
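Reasons (1) through (4) can also be seen in a few lines of base R. The sketch below uses a small, made-up data frame (the column names id, x1, x2 and y are hypothetical), pulls its numeric columns into a matrix, and solves a least-squares problem with the whole-matrix operators and the chol(), eigen() and svd() functions mentioned earlier.

```r
# Hypothetical heterogeneous data set; names and values are illustrative only.
df <- data.frame(
  id = c("a", "b", "c", "d", "e"),
  x1 = c(1.0, 2.0, 3.0, 4.0, 5.0),
  x2 = c(2.1, 0.5, 3.3, 1.8, 4.0),
  y  = c(3.2, 2.4, 6.1, 5.9, 9.2)
)

# Populate a design matrix from the numeric columns of the data frame.
X <- cbind(intercept = 1, as.matrix(df[, c("x1", "x2")]))
y <- df$y

# Whole-object operations that mirror the matrix notation.
XtX <- crossprod(X)              # t(X) %*% X
Xty <- crossprod(X, y)           # t(X) %*% y

# Solve the normal equations XtX %*% beta = Xty directly.
beta <- solve(XtX, Xty)

# Solve the same system through the Cholesky factor: XtX = t(R) %*% R.
R <- chol(XtX)
beta_chol <- backsolve(R, forwardsolve(t(R), Xty))

# The eigenvalues of t(X) %*% X are the squared singular values of X.
ev <- eigen(XtX, symmetric = TRUE)
sv <- svd(X)

print(cbind(solve = as.vector(beta), cholesky = as.vector(beta_chol)))
print(cbind(eigenvalues = ev$values, squared_singular_values = sv$d^2))
```

The two coefficient columns should agree up to floating-point error, as should the eigenvalues and the squared singular values.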


More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire), overseeing the product management of S-PLUS and other statistical and data mining products.

David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.