Blog :: PCA - With great power comes great responsibility

Here's a post that Paul contributed to the JMPAhead blog:

Tasked by comrade Owen to write a two-thousand-word article on PCA, I thought, why not combine three of my favourite geeky pastimes: 1. Top Trumps, 2. superheroes, together with the villains that shape 'em, and 3. the power of graphs to improve understanding? You know that "a picture is worth a thousand words"; well... here are two pictures. You can do the math.

 

OK, graphs don't always speak for themselves. This is an important observation. It helps to know something about the subject, or at least to have enough interest in it to do the research needed to establish a frame of reference. Pictures can be powerful in the hands of those who understand what they convey and can interpret them. If you are a chemist, like comrade Owen, then pictures relating solvents, reagents and other chemical ingredients, rather than supers, are more likely to get your spidey senses tingling.

Having combined my childhood affection for Top Trumps with my affliction for supers and statistics*, I arrived at the attached JMP data table: a rogues' gallery of 58 characters together with 8 characteristics describing their powers, abilities and skills. My solitary experience of supers taught me that some characters are quite similar in their characteristics and abilities, while others are quite different. No doubt it was from this premise that high-performing teams of supers were created, such as the Justice League, the Fantastic Four, the X-Men and my favourite team of misfits, the Watchmen.

[*If you are unfortunate enough to be cornered by me at a party and feel compelled to reply to my opening gambit, "Hi, I'm a statistician and which superhero is most like you?", simply visit this site.]

While boring comrade Owen on a flight to Philadelphia with these facts, we bounced around the idea of asking a U.S. audience to select a basketball team of supers (a five-a-side football team for our European friends). They would pick three times: firstly, from a small set of Top Trumps cards (impossible if you know little about supers); secondly, from the database pooled from all the sets of cards (not easy); and finally, from pictures that display and convey the information contained within the data.

 

Data itself is not information. It is challenging to see patterns and relationships in a table full of numbers. A great man once told me "If you want to hide something from the reader, put it in a table. If you want to bury something, put it in a table in an appendix. If you want to convey something, put it in a graph." However, visualising the characters measured in 8 dimensions isn't easy. Our brains can cope with 2 or 3 dimensions. But don't despair; even supers rely on science to cope.

Taking direction from Iron Man's tagline, "When man and machine combine a hero is born"... let's combine our human visual skills with the PCA data reduction tool. PCA uses the correlations that exist between the supers' characteristics to compress, or project, an 8-D scatter onto one comprising only a few relevant axes, or components (linear combinations of the original variables), that describe the major relationships and patterns in the data. These component maps, and the information they contain in the form of their corresponding navigational loadings plots, can then be displayed as the two-dimensional scatter plots you see above, for a simplified but reasonably comprehensive interpretation.
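For the curious, here is a minimal sketch of what that compression might look like outside JMP, using Python and NumPy. It is an illustration under stated assumptions rather than the analysis behind the plots: the 58 x 8 table is simulated, because the real Top Trumps values live in the JMP data table, and names such as `hidden`, `mixing` and `weights` are made up for the example.

```python
import numpy as np

# Simulated stand-in for the supers table: 58 characters x 8 characteristics.
# These numbers are invented; they are NOT the real Top Trumps data.
rng = np.random.default_rng(0)
n_supers, n_traits = 58, 8
hidden = rng.normal(size=(n_supers, 2))      # two underlying "themes"
mixing = rng.normal(size=(2, n_traits))      # how the themes show up in each trait
X = hidden @ mixing + 0.3 * rng.normal(size=(n_supers, n_traits))

# Standardise each characteristic so that PCA works on correlations,
# not on the raw measurement scales.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Correlation matrix of the 8 characteristics and its eigen-decomposition.
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)         # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]            # re-sort, largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep the first two components.
weights = eigvecs[:, :2]                     # 8 x 2 linear combinations of the traits
scores = Z @ weights                         # 58 x 2 coordinates on the component map
explained = eigvals / eigvals.sum()          # share of total variation per component

print("share of variation, first two components:", np.round(explained[:2], 2))
```

Each row of `scores` is one character's position on a 2-D component map, and `explained` gives a rough idea of how much of the 8-D picture survives the squeeze into two dimensions.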

Now let us look for patterns. The 2-D component map of the supers lets us quickly pick out, by eye, characters that are

  1. close on the map, such as the circled Colossus and Juggernaut, who, as their names suggest, are likely to have similar attributes and abilities (i.e., height and strength),
  2. diagonally opposite, such as the rock-like, super-strong but short-fused and impetuous member of the Fantastic Four, The Thing, and his flexible, pragmatic and thoughtful leader, Mr. Fantastic.
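What the eye does on the map can also be done with a little arithmetic. The sketch below assumes that "close" means a small straight-line distance between two points and that "diagonally opposite" means pointing in opposite directions from the centre of the map (cosine near -1). The coordinates are invented for illustration and are not the real component scores.

```python
import numpy as np

# Invented component-map coordinates, for illustration only;
# the real positions come from the PCA of the JMP table.
names = ["Colossus", "Juggernaut", "The Thing", "Mr. Fantastic"]
scores = np.array([
    [2.0, -1.5],
    [2.2, -1.7],
    [1.8, -1.0],
    [-1.9, 1.2],
])

# "Close on the map": the pair with the smallest Euclidean distance.
diff = scores[:, None, :] - scores[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))
np.fill_diagonal(dist, np.inf)               # ignore each character's distance to itself
i, j = np.unravel_index(np.argmin(dist), dist.shape)
closest_pair = (names[i], names[j])

# "Diagonally opposite": the pair whose directions from the origin
# are most opposed, i.e. whose cosine is closest to -1.
unit = scores / np.linalg.norm(scores, axis=1, keepdims=True)
cosine = unit @ unit.T
k, m = np.unravel_index(np.argmin(cosine), cosine.shape)
opposite_pair = (names[k], names[m])

print("closest:", closest_pair)
print("most opposite:", opposite_pair)
```

With these made-up coordinates the two pairs mirror the examples in the list, but only because the numbers were chosen that way.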

The 2-D loadings plot displays the relationships between the original characteristics and acts as a navigational aid to find your way around the component map. For instance, characters in the bottom right-hand corner of the component map possess super-strength, height and endurance, while characters to the left of the map are the geniuses.
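One common way to compute the numbers behind a loadings plot is to scale each component's weights by the square root of its eigenvalue, which turns them into correlations between each original characteristic and each component. The sketch below assumes that convention (JMP's exact output may differ) and uses four hypothetical traits with simulated values, three driven by "brawn" and one by "brains".

```python
import numpy as np

# Hypothetical trait names and simulated values; not the real Top Trumps data.
traits = ["strength", "height", "endurance", "intelligence"]
rng = np.random.default_rng(1)
brawn = rng.normal(size=58)
brains = rng.normal(size=58)
X = np.column_stack([
    brawn + 0.3 * rng.normal(size=58),
    brawn + 0.3 * rng.normal(size=58),
    brawn + 0.3 * rng.normal(size=58),
    brains + 0.3 * rng.normal(size=58),
])

# Standardise, then eigen-decompose the correlation matrix (largest first).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings: correlation of each original trait with each of the first two components.
loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])
scores = Z @ eigvecs[:, :2]

for name, (l1, l2) in zip(traits, loadings):
    print(f"{name:>12}: PC1 {l1:+.2f}, PC2 {l2:+.2f}")
```

Traits whose loadings point the same way travel together, so a character sitting far out in that direction on the component map scores highly on all of them. The sign of each axis is arbitrary, which is why the plot is read as directions rather than as "positive is good".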

These two plots give us a visual way to identify ideal characters for our basketball team based on their characteristics and abilities. Which character would you pick to manage or lead the team, and which one would you pick to strike fear into the opposition? For a thorough explanation of how the same tool can be used to map and select ingredients for chemistry based on their physicochemical properties, check out the book by Rolf and Johan Carlson, "Design and Optimization in Organic Synthesis: Revised and enlarged edition".

As a geek, I can't resist letting you know that my favourite character is The Silver Surfer. His superhuman strength, stamina and durability (health) help him to navigate through interstellar space and dimensional barriers, exceeding the speed of light on his super-cool surfboard. However, his was a life of servitude rather than leadership. Note that he is the direct opposite of the world's most intelligent telepath, Professor X, who can read, control and influence minds.

Which super is most like me? Spider-Man. Cool. PCA. C'mon, use it. Don't make me angry. You wouldn't like me when I'm angry.

(Link to Paul's original article on the JMPAhead blog.)

Posted on Mar 1, 2011 at 10:06AM by prismtc

Slidepack :: Introduction to Lean Sigma

Here's a copy of the slidepack from our recent Introduction to Lean Sigma seminar. Thanks to all those who attended and whose input and contributions added to the day. Click on the icon at the top-right of the panel below for a full screen view.

Posted on Apr 15, 2009 at 03:48PM by prismtc

Article :: Quality by Design (UKICRS Newsletter June 2008)

We've just received our PDF copy of Paul's article, which was published in the summer newsletter of the United Kingdom and Ireland Controlled Release Society (UKICRS). He writes about the Quality by Design framework and looks beyond simple product testing, describing how to design quality into the whole development and manufacturing process itself. The eagle-eyed amongst you will spot some illustrations featuring screenshots of our next release of CELLULA-V.

You can read the article below. Or, for a full-screen version, click the icon at the top right of the article window.


Posted on Jan 12, 2009 at 02:50PM by prismtc