Internship Post #6 – The Digital Historian

Post #6: What skills or knowledge from your coursework are you using in your internship? Have you noticed a difference between theory and practice? Why or why not?

When I compare my studies in the Digital Humanities (DH) program at George Mason University with my internship at the Smithsonian Institution, the difference between theory and practice becomes evident. The coursework has been very valuable in defining the field of Digital Humanities and its potential applications. The internship, while reinforcing the importance of DH, especially in capturing and sharing knowledge, has also shown how many organizations struggle to leverage advances in computing technologies in the workplace.

Looking back at some of my earlier class assignments, I came across this paragraph I wrote defining DH:

“While all fields of academia have been – digitized – the Humanities appears to have gone through a more contorted transformation. This is a direct result of the impact computing has on digitizing text, and the corresponding development of taxonomies, ontologies, and metadata. Early definitions of DH primarily focused on the collection of data, initially textual, that enabled academics to “objectify” their thoughts and concepts and to make them more public. This reflects the early digital technologies that leveraged word processing. The ability to search large corpus of text and analyze and research patterns of information was groundbreaking.”

My internship experience, first on American Ginseng and now on the Earth Optimism project, demonstrates that adopting DH technologies is an evolutionary process. For example, the Smithsonian embraced content management technologies, especially digital asset management, years ago, but for the most part these have been text based. Today, there is a demonstrable need for media asset management, especially with the exponential growth of audio and video files. As a result, automating the summarization of media content by extracting metadata is becoming a high priority.

Over the past several years, the Earth Optimism Project has produced a large number of informative environmental videos, many linked on Smithsonian websites. But with no process for extracting metadata from these stored media files, the organization faces a significant problem: how to make them more discoverable and accessible. So far, my internship has taught me that most people can define the problem that needs to be solved, but not how their organization can solve it. My other work experience suggests that the very nature of large, bureaucratic organizations prevents agile procurement and adoption of new technologies and processes. As a result, the Smithsonian, while demonstrating its strength in developing and producing educational content, also recognizes that it has a significant gap in its processes for distributing and repurposing this media, especially in today’s social media environment, which leverages metadata, tagging, and content integration.

The main knowledge I am drawing on from my coursework is the broadening definition of Digital Humanities: it is no longer about the technology but rather about what an organization can do with it and who ultimately can have access to the information. What I am learning from the internship is that the Smithsonian, like other organizations, is challenged by its traditional role as a gatekeeper of knowledge, while recognizing that information technology, especially social media, requires it to transform the way it interacts with audiences. As a result, the Smithsonian needs to reevaluate its strategic planning and its introduction of computing technology, especially artificial intelligence, to assist in sharing information and engaging with people seeking media content.

The Smithsonian clearly recognizes the need to invest in cognitive services such as media indexing, tagging, transcription, and translation to automate content summarization and integration. In the past, these services were expensive and difficult to integrate with existing applications. Today, these cloud-based technologies have become quite common, especially in the entertainment industry. For example, Netflix and other streaming providers use content summarization to help audiences search for and select programs. Finally, and most importantly, the cost of these services is decreasing dramatically, so the return on investment (ROI) becomes more evident when comparing the human hours required to extract metadata from media files with an automated process. I expect that in the next several years the Smithsonian will introduce cognitive services across the enterprise and automate its metadata extraction. It will be another step in the evolving impact that digital humanities is having on the workplace.
