Research Applications for 3D Models in Art History

February 11th, 2009 § 4

These days, it is difficult to find a television documentary detailing an archaeological site that does not feature a representation in the form of a 3D model. Computer models make good teaching tools. A class of students may not have the opportunity to travel to Rome to view the Colosseum first-hand, and even if they did, they would have great difficulty visualizing what the mostly-ruined structure looked like 1,900 years ago. A model based on the most recent archaeological research, however, can help fill in the gaps left by time and the elements.

One of the more important aspects of a computer model is that it is dynamic. Using software, a model can be adjusted to reflect newer theories of the site’s architectural reconstruction. This stands in stark contrast to artists’ sketches and paintings, which, over time, tend to become outdated. Importantly, like other visualization methods used in the humanities (such as GIS), 3D models can help scholars get a fuller picture of a site and formulate research questions that never would have been considered otherwise. This is the case in my most recent research.

Having never truly given up on the video game design aspirations of my high school days (I specifically remember my father turning the breaker off to the upstairs when I was up until 4 AM designing a Quake map), I have found a niche within my field of academic interest—Roman archaeology and architectural history. While many of my Pompeianist classmates take a more traditional approach to graduate research projects, I chose to develop a 3D model of the House of the Faun, one of the largest and most famous houses in the city. The model was constructed as accurately as possible based on the archaeological plan, a number of artists’ reconstructions, and photographs of the house (many gathered from Flickr).

The intent of the model was to test art historians’ philosophical assertions about Roman atrium houses. With accurate lighting simulation (i.e., calibrating simulated sunlight to the latitude and longitude of the house and to any point in time back to antiquity), high-resolution images of the model rendered with mental ray software gave me a glimpse of what the House of the Faun looked like at noon on January 1st, 100 B.C., which is something no artist can replicate.
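To give a sense of what calibrating simulated sunlight involves, here is a minimal sketch in Python of the simplest piece of it: the sun's elevation at solar noon for a given latitude and day of the year. It is not what the rendering software does internally. It uses a common textbook approximation for solar declination with today's axial tilt, and it ignores both the slow change in that tilt since antiquity and the differences between ancient and modern calendars. The latitude used for Pompeii (about 40.75° N) and the function names are assumptions made for illustration.

```python
import math

AXIAL_TILT_DEG = 23.44  # present-day obliquity; an assumption, slightly larger in antiquity


def solar_declination_deg(day_of_year: int) -> float:
    """Approximate solar declination (degrees) for a day of the year (1-365)."""
    angle = 2 * math.pi * (day_of_year + 10) / 365
    return -AXIAL_TILT_DEG * math.cos(angle)


def noon_elevation_deg(latitude_deg: float, day_of_year: int) -> float:
    """Approximate elevation of the sun above the horizon at solar noon."""
    return 90.0 - abs(latitude_deg - solar_declination_deg(day_of_year))


POMPEII_LATITUDE_DEG = 40.75  # approximate, assumed for illustration

winter = noon_elevation_deg(POMPEII_LATITUDE_DEG, 1)    # January 1st
summer = noon_elevation_deg(POMPEII_LATITUDE_DEG, 172)  # around the June solstice
print(f"Noon sun on January 1st: about {winter:.0f} degrees above the horizon")
print(f"Noon sun near the June solstice: about {summer:.0f} degrees")
```

Under these assumptions the noon sun sits far lower in midwinter than in midsummer, so the reach of daylight into a roofed interior changes considerably over the year.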

Lighting simulation may also have an impact on how we consider the artwork within the house. For example, when many art historians point to the colors of a mosaic as proof of its Greek influence, can that assertion hold up against the fact that the mosaic was rarely in sunlight?

House of the Faun

Many of us have seen Roman floor mosaics hanging on the walls of American and European museums, but they have been removed from their original context. Even in Pompeii, one of the best-preserved sites of the ancient world, the roofs collapsed long ago, making it difficult to visualize the natural lighting scenario within the House of the Faun and other structures within the city. 3D models allow us to put artworks back in their original context and consider how the ancients viewed them, which is quite different from how we view them now. In this case, the computer model is more than just a teaching tool; it is a scholarly research tool.

Peer Review for Visual Aids?

February 4th, 2009 § 1

How frustrating is this: you sit down to take in some form of scholarly work (be it a book, an article, or a talk) and find yourself increasingly confused by a bombardment of graphs, figures, and maps that don’t make sense, either because they contain too much or too little information or because the information is poorly labeled (if at all). Or even worse, you are the person writing the book or article or giving the talk, and instead of fielding questions on your scholarly processes, you are repeatedly explaining to the audience what your visual aids actually represent.

A picture may be worth a thousand words, but if it is not a language your audience speaks, where have your efforts gotten you?

Typically, when I read a scholarly article, my first read-through goes as follows: I read the abstract, I look at each of the figures, maps, tables, and graphs and their annotations, and I read the conclusion. It’s not until the second read-through that I examine the bulk of the text. I think that words sometimes have the unfortunate tendency to obfuscate the true findings of research and, truth be told, I like to find out whether I draw the same conclusions from the provided data as the author(s) do. My process stumbles when I encounter articles whose figures, graphs, and maps contain either a glut or a dearth of information, making them non-intuitive to the uninitiated reader.

Some highlights: a map of a state containing rivers, waterbodies, and watershed boundaries (the focus of this particular article) AND all of the major roads and highways (NOT the focus of the article). All in gray-scale. Add in the point locations and names of the state’s twelve most populous cities and cram it into a box three inches tall by five inches wide. The focus of the article was on modeling and delineating the major and minor watersheds of the area in order to develop a best management practice for cooperating water districts. Needless to say, that point was lost in the shuffle.

Another example, which is all too common: a graph depicting change over time of 10 or more constituents using various dotted, dashed, and solid lines of variable thickness. With that much information crammed into a single visual aid, the results are nearly impossible to pick out.

We have writing clinics and public speaking critique sessions, so why don’t we have a peer evaluation system for visual aids? I think that many people (myself included) fall into a habit of having our material critiqued solely by our close working group. While this is certainly a necessary step in the writing process (the people most familiar with our work are the ones most likely to pick up on the esoteric flaws), many scholars neglect to obtain peer review from individuals tangential to or completely outside of their small fields. I would say that one of our main objectives as scholars is to use our work to excite interest from members of the scholarly community inside and outside of our focused area. In my opinion, an important step towards this goal is to make our visual aids more accessible to the curious non-expert.

I would like to see our scholarly community develop this type of peer-review network where we can utilize the human resources around us to improve our intellectual contribution to all of our respective fields.  We could have minds from a variety of fields of study working collaboratively to improve the accessibility (and therefore the use) of our collective body of knowledge.   I think the concept has amazing potential.

Social Media and the Inauguration

January 16th, 2009 § 2

Social Media in the SLab

Join us in the Scholars’ Lab Monday morning through Wednesday night next week, as we project the social media landscape surrounding the historic presidential inauguration.

We’ll be showing real-time Twitter and Flickr feeds that record people’s responses to the event and their efforts at citizen-journalism. We’ve also created a home-grown geospatial visualization so that you can follow the worldwide conversation!

Visit the Lab for a little social interaction of your own, or access the site (which includes more information and related links) online.

Map “Vocabularies”

November 19th, 2008 § 1

For the past year, I have been working on the Scholars’ Lab Geospatial Data Portal, the lab’s effort to make our GIS data sets readily available to UVA students, faculty, and staff via the world wide web by using a suite of open source, open standards-based applications. A particular aspect of this project that I have enjoyed exploring is the way in which we display our visual information.

Stop to think about the last paper map you used. Minor roads were probably displayed with a line of a certain color and thickness, highways with another. Green spaces were colored differently from open water, buildings, and so on. Cartographers have long toiled to develop visual representations of our environment and make them identifiable for general use. People naturally associate certain colors on a map with identifiable features in their environment (e.g., green on a map with forests, parks, and open areas). Much like the words in a book, these symbols and representations must form a language the audience understands; otherwise, the information contained on the map will go unused.

What I have done for the Geospatial Data Portal is to expand our symbolic vocabulary. I create styles: XML-based documents that allow us to display visual information through symbols that our patrons will understand and associate with specific attributes. An example: I can map the waterlines for a given city with a solid pink line 2 pixels wide. While it is true that the information is mapped and is useful to an extent, I think there is a way to display the same information while making it more visually recognizable as city waterlines and ultimately more usable to our patrons. Instead of a solid pink line of a single width, we can display the information as blue lines whose widths depend on the size of the pipe (e.g., a main feeder pipe with a diameter of 15 ft is represented as a blue line 8 pixels wide, whereas a small pipeline with a diameter of 2 ft is represented as a blue line 1 pixel wide).
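The width rule in this example can be sketched as a small function. This is an illustration in Python rather than the portal's actual XML style. Only the two endpoints (a 2 ft pipe at 1 pixel, a 15 ft pipe at 8 pixels) come from the example above; the straight-line interpolation between them, the clamping of out-of-range diameters, and the function name are assumptions.

```python
def pipe_width_px(diameter_ft: float,
                  min_diameter_ft: float = 2.0, max_diameter_ft: float = 15.0,
                  min_width_px: float = 1.0, max_width_px: float = 8.0) -> float:
    """Map a pipe diameter to a line width by linear interpolation."""
    # Clamp so unusually small or large pipes still get a drawable width.
    d = max(min_diameter_ft, min(diameter_ft, max_diameter_ft))
    span_ft = max_diameter_ft - min_diameter_ft
    return min_width_px + (d - min_diameter_ft) * (max_width_px - min_width_px) / span_ft


for diameter in (2, 6, 15):
    print(f"{diameter} ft pipe -> {pipe_width_px(diameter):.1f} px blue line")
```

In a style document the same idea might instead be expressed as a handful of rules, one per diameter range, each with its own line width.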

So what has this accomplished? People tend to associate size on a map with importance in the real world, so by exaggerating the size difference of the pipe by weighting pixel width we can draw our users’ attention to the important locations on the map. And by using blue, we identify our information of interest as a water feature because most people associate blue on a map with water features in their environment. Now our patrons are able to go from displaying simple lines on a page to creating a map which displays intuitively symbolized information using only their internet browser. I believe this project has the potential to greatly expand the user-base for our GIS data sets and allow for new forms of scholarship because it makes the process of displaying information in an identifiable and comprehensible way much more user-friendly.

Biblical Statistics

October 9th, 2008

The first topic that I chose for my dissertation in UVA’s Department of Religious Studies was the “School of Saint Paul.” I hoped to show the existence of a group of followers who surrounded Paul and engaged with him in the interpretation of the Old Testament. In order to do this, I decided to investigate how Paul used scripture in his epistles and how the followers of Paul used the same scripture in their writings. I anticipated finding certain portions of the Old Testament that either were used exclusively in the Pauline and post-Pauline literature or were used differently in the Pauline and post-Pauline literature than in the rest of the New Testament.

But I had a problem. The fund of Pauline and post-Pauline quotations and allusions to the Old Testament numbered more than 1000 cases. How could I represent such a large set of data in a way that made it easily comprehensible? A friend of mine suggested that I needed to represent the data graphically. And a colleague here at the Scholars’ Lab, where I work as a graduate consultant, recommended SPSS as the best way to accomplish such graphical representation.

I already had a table that I had made in Microsoft Word of every usage of the Old Testament in both the Pauline and post-Pauline literature. I needed to get this data into SPSS with as little headache as possible. So, I converted the data into an Excel file, saved this in a format that SPSS could read, and then imported it into SPSS. At that point, I had accomplished the hard part. All that was left to do was to analyze and graphically represent this data. And here is one example of what I produced:

Genesis 1-3 in the New Testament

Unfortunately, this analysis of the data made it clear that the evidence was insufficient for my dissertation! I found no significant chunks of the Old Testament that were used exclusively in the Pauline and post-Pauline literature. And I discovered that trying to set the Pauline and post-Pauline use of scripture against that of the rest of the New Testament was speculative, at best. I ended up having to change my dissertation topic. But it was this statistical analysis and work in information visualization that made it clear to me that the evidence was insufficient. Without it, it is possible that I would still be chasing the wild goose that was my previous topic.

How to Measure Text?

September 9th, 2008 § 4

…the words we join have been joined before, and continue to be joined daily. So writing is largely quotation, quotation newly energized, as a cyclotron augments the energies of common particles circulating.

- Hugh Kenner, The Pound Era

This month marks the beginning of the complicated process of starting up the Large Hadron Collider, the world’s largest particle accelerator (Kenner would have called it a “cyclotron”), buried beneath the Franco-Swiss border. Near the top of the LHC’s agenda is having a peek into the fabric of space-time to see about the Higgs boson, the theorized source of mass.

But to do so they’ll need data, and lots of it. According to CERN, the event summary data extracted from the collider’s sensors will amount to around 10 terabytes daily. That is something like, to use the cliché, the equivalent of a Library of Congress’s worth of data every day (the raw data is much, much greater).

The physics involved is obviously too complicated for a mere humanities major to discuss in any intelligent way. The interesting thing is the disparity between the sheer amount of data with which the LHC deals, as compared with the scale of the (textual) data of the humanities. How can the LHC, in a single day, focussed on a highly specific set of questions, produce as much information as the literary output of humans represented by the Library of Congress? Why, in short, is the textual data of the humanities so much smaller than the data produced by the LHC?

It is, of course, in some ways a silly, completely naive question. But the differences, in size alone, of these two datasets are nevertheless instructive and worthy of consideration. We might oversimplify the matter, and say that the LHC’s data, collected from its sensors and culled by its arrays of servers, is fundamentally information-poor data. The challenge faced by the LHC project is sorting through the complexities of the data to find the relevant information that will allow physicists to answer the questions they have. Language, by contrast, is information rich–so rich that our challenge is not how to separate the wheat from the chaff, but how to deal with the sheer flood of information compressed in text.

It is this fact that explains the disparity in size between the LHC’s data and the textual record of the humanities. The textual data of the humanities comes “preorganized” by language. While our digital texts encode only strings, language fills texts with syntactic and semantic information to which our systems of markup are completely oblivious.

Martin Wattenberg at IBM’s Watson Research Center puts it well in his interview with Wired when he describes language’s ability to compress information:

Language is one of the best data-compression mechanisms we have. The information contained in literature or email, encodes our identity as human beings. The entire literary canon may be smaller than what comes out of particle accelerators or models of the human brain, but the meaning coded into words can’t be measured in bytes. It’s deeply compressed. Twelve words from Voltaire can hold a lifetime of experience.

What happens if we take this understanding of language seriously? How would it change the way we deal with textual data?

Right now we have plenty of digital texts available, but in order to get the information out of the textual data we have to read it. Only by reading do we attend to the specifically linguistic nature of textual data. Existing text analysis technologies and techniques remain largely quantitative, relying on machine learning to classify texts that are represented by vectors of frequency counts. Key sources of linguistic information, however, like syntax, remain fundamentally unexploited. We are still, in effect, discarding some of the most basic sources of textual information, such as the order in which the words occur (seriously).
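A small sketch in Python makes the point about word order concrete. Once a text is reduced to frequency counts, two sentences that use the same words in a different order become indistinguishable. The example sentences and the function name are made up for illustration, and the whitespace tokenizer is far simpler than anything used in practice.

```python
from collections import Counter


def frequency_vector(text: str) -> Counter:
    """Reduce a text to word counts, discarding the order of the words."""
    return Counter(text.lower().split())


first = "the dog bites the man"
second = "the man bites the dog"

# The two sentences mean different things but yield identical counts.
print(frequency_vector(first))
print(frequency_vector(first) == frequency_vector(second))
```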

One avenue is to use a technique like part-of-speech tagging to supplement raw text with tags that provide a fuller, more information-rich digital representation of the linguistic data. By analysing such part-of-speech tags, taking them in pairs, or looking at where in a sentence they occur, we get some sense of how a writer uses language. We step, in short, over the threshold from a purely quantitative view of language use (e.g., how many times does “of” occur per thousand words? what are the most frequently occurring terms?) to a mode of analysis that is able to extract the sort of information that we humans are able to extract when we read. Such techniques are admittedly crude, but they begin to recapture the fundamentally linguistic nature of textual data, which is too easily discarded in representations of natural languages. To truly capitalize on the information contained in textual data requires finding more ways to digitally attend to that linguistic nature.
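As a rough sketch of what taking tags in pairs might look like, the Python below counts adjacent tag pairs in a sentence that has already been tagged. The tags here were assigned by hand to the opening words of the Kenner epigraph and are an assumption for illustration; in practice they would come from a part-of-speech tagger, and the tag names would depend on that tagger's tag set.

```python
from collections import Counter

# Hand-tagged (word, tag) pairs; the tags are illustrative assumptions,
# not the output of any particular tagger.
tagged = [
    ("the", "DET"), ("words", "NOUN"), ("we", "PRON"), ("join", "VERB"),
    ("have", "AUX"), ("been", "AUX"), ("joined", "VERB"), ("before", "ADV"),
]


def tag_bigrams(tagged_words):
    """Count each pair of adjacent part-of-speech tags."""
    tags = [tag for _, tag in tagged_words]
    return Counter(zip(tags, tags[1:]))


for (left, right), count in tag_bigrams(tagged).most_common():
    print(f"{left} -> {right}: {count}")
```

Unlike a plain word count, these pairs retain a little of the order in which the words occur.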

We are trying to read the finely wrought braille of language through the burlap sack that current digital tools offer. With the combination of natural language processing tools (such as POS taggers, parsers, etc.) and ever-more sophisticated machine learning techniques, we may be able to get closer. Humanities data is not, necessarily, smaller; it is just more compressed.
