M ADAMSON ASSOCIATES
Bellwether Spotlight

Creating Value & Sparking Innovation

Unlocking the Treasure Trove with Inspec Analytics

4/25/2019

 

Like cracking the code of semantic technologies and linked data, Inspec Analytics seems like the perfect fit for an A&I service with depth of coverage and rich scientific metadata, and it is loaded with value for the users, value for the customers, and value for the organization.
[Figure: Organizational Comparison, from the Inspec Analytics User Guide]
      Many have heard the siren’s call of linked data and semantic technologies since they were first introduced in the 1990s, only to be dashed on the rocks of the practical realities of implementation or forced into a serious recalibration of approach.  The vision of a semantic web with hyperdata links as ubiquitous as document hyperlinks is appealing, but Tim Berners-Lee’s vision may not be realized as he first envisioned it.  However, adaptations of his dream are finding their way into early applications in financial services, healthcare and pharmaceuticals (AstraZeneca), retail (eBay chatbot), enterprise applications (used for providing business insights, predictive modelling, repurposing and reusing content), and knowledge graphs like those of Google and Wikipedia. [For Google-watchers, see also Google’s recent patent profiled in OntoSpeak.]
“Semantic Technologies will continue to see steady growth and adoption but will likely never be the rallying flag on their own. I think we will continue to appreciate Semantic Technologies as an infrastructure play, in service to broader needs such as Artificial Intelligence, Machine Learning, or data interoperability. Semantic Technologies will come to assume their natural role as essential enablers, but not as keystones on their own for major economic and information change.” 
                            Michael Bergman quoted in
                            Semantic Web and Semantic Technology Trends 2018

      In publishing, there are intriguing initiatives like Inspec Analytics and Springer Nature’s SciGraph (not covered here).  In libraries, OCLC completed and published results of the third International Linked Data Survey in December 2018. Results suggest development is mostly experimental. This revealing survey, led by Karen Smith-Yoshimura and the OCLC Research Library Partnership team, includes survey results from 2014, 2015 and 2018, with insights into such projects – how respondents view measures of success, obstacles encountered, and lessons learned. 
The appeal for publishers and libraries of flexible data models is strong:
  • Improved discovery and interoperability across disparate sources and types of content
  • New types of uses for content, including just the metadata
  • Easier to perform analysis, run reports
  • Creates a direct high value conversation with the customer and new classes of users
  • Competitive advantages
  • Visualizations enabling the ability to ‘walk the graph’
     Yet a combination of reasons can lead to linked data projects not being pursued or being put on hold.  Several publishers have indicated that they see no immediate new revenue streams to offset the significant investment.  For libraries, respondents to the OCLC survey repeatedly cited “requires more staff and stakeholder buy-in.”  There are the challenges of large indexing projects working with ambiguous vocabularies and unclear objectives.  While the capabilities might appeal, applications are likely to languish without an easy-to-use front end that mines the potential.  Additional barriers to adoption include a low-level query language and huge previous investments in relational databases. 
     With this as context, the presentation of Inspec Analytics by Vincent Cassidy, Director of Academic Markets, at the February 2019 NFAIS meeting stands out as a compelling initiative because it seems to deliver a pragmatic and substantive set of offerings.  Like cracking the code, it seems like the perfect fit for an A&I service with depth of coverage and rich scientific metadata, and it is loaded with value for the users, value for the customers, and value for the organization. Intrigued by Inspec Analytics, I followed up with an interview and demo with Tim Aitken, Senior Product Manager, reflected in this piece.

 So What is Linked Data?
     In semantic web terminology and for the uninitiated, linked data is used to describe a method of exposing and connecting data [often factual content] on the web from different sources.  The web uses hyperlinks that allow people to move from one document to another.  Linked data uses hyperdata links to do something similar, for example Barack Obama ... attended ... Columbia University, or the University of Toronto ... publishes ... ‘x’ articles ... on bioengineering. You can extract some of this information from ordinary documents, but it is far easier using linked data, where the relationships are already created.  Linked data makes it easier for computers to make sense of information by showing clearly defined relationships, and then to link this information across different sources and types of content.  Once these relationships have been established, using the information depends on how it is accessed and served up. This includes how the linked data is searched and how it is analyzed and presented via a graphical user interface (GUI).  
“These storehouses of semantic relationships are often referred to as ‘knowledge graphs’ (the term ‘graph’ is about relationships, not visualization) or ‘triple stores’ (a ‘triple’ is a subject-predicate-object relationship). The power of a knowledge graph or triple store is that it enables you to infer relationships. You can ask it questions and it can give you answers that aren’t explicitly stored in the data. In effect, it isn’t ‘finding answers,’ it’s ‘figuring out answers.’ This is powerful!”
            Bill Kasdorf, Kasdorf & Associates
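
     For readers who like to see the idea in code, the subject-predicate-object ‘triple’ described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Triple, TRIPLES and match are invented for this example, the data is a handful of hand-typed statements, and a real triple store would use RDF identifiers and a query language rather than plain strings.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# Illustrative statements only, loosely based on the examples in the text.
TRIPLES = [
    Triple("Barack Obama", "attended", "Columbia University"),
    Triple("University of Toronto", "publishes articles on", "bioengineering"),
    Triple("Columbia University", "is a", "university"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# "Who attended Columbia University?" expressed as a pattern over the graph.
attendees = [
    t.subject
    for t in match(TRIPLES, predicate="attended", obj="Columbia University")
]
print(attendees)
```

     The point of the sketch is that the relationship itself (“attended”) is stored as data, so a question becomes a pattern to match rather than a set of words to find.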

     Traditional search matches words.  This could be described as the “is” or “is not” of a traditional search.  Semantic technology together with linked data adds another layer of meaning. It adds more ‘verbs’ (predicates) like “attended,” “works at,” “is married to.”  Suddenly there are many more relationships than “is” or “is not.” These relationships are machine-readable and open the possibility to apply inference engines.  For instance:
          Fact one: Millie graduated from Stanford University.
          Fact two: Stanford University is an accredited US institution.
          Inferred fact: Millie graduated from an accredited US institution.
     While not terrifically exciting at the level of this example, the more data available for analysis, the richer and more accurate the inference results. 
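
     The Millie example can be written as a tiny rule applied to a set of facts. The sketch below is a deliberately naive illustration, not how any production inference engine works: the infer function, the predicate strings and the single hard-coded rule are all assumptions made for this example.

```python
# The two stated facts from the example, as (subject, predicate, object).
FACTS = {
    ("Millie", "graduated from", "Stanford University"),
    ("Stanford University", "is an", "accredited US institution"),
}


def infer(facts):
    """Apply one hard-coded rule until nothing new is derived.

    Rule: if X graduated from Y, and Y is an Z,
    then X graduated from an Z.
    """
    known = set(facts)
    while True:
        derived = {
            (person, "graduated from an", category)
            for (person, p1, school) in known
            if p1 == "graduated from"
            for (school2, p2, category) in known
            if p2 == "is an" and school2 == school
        }
        if derived <= known:  # nothing new, so stop
            return known
        known |= derived


# Facts that were not stated explicitly but follow from the rule.
inferred = infer(FACTS) - FACTS
print(inferred)
```

     The inferred statement was never stored; it was worked out from the two facts and the rule, which is the ‘figuring out answers’ behavior described in the quote above.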


The Layered Look
     How do you recognize that semantic technologies are being used as part of a product?  How “smart” is a potential resource? 
     It may not affect the quality of the product offerings, but it may define attributes related to speed, efficiency, interoperability, and flexibility for future development.  For instance, Dimensions says they have “linked research data.” That could mean more than one thing and is useful to explore to understand the full capabilities.

     In what ways is machine learning integrated (or not)?  What types of data analytics are employed?
  • Visualizations or analytics do not require semantic technologies.  Analytics and reporting can sit on top of a relational database, without the linked data. Several commercial databases seem to do just that, adding an analytic layer on top of a relational database. 
  • Linked data does not require a relational database.  If one is available, it offers a useful foundation for a hybrid approach.
  • A resource with linked data may have a usable but not particularly user-friendly interface.  The power is unleashed as usable infrastructure.
     One assumes if a publisher is using semantic technologies, they may want bragging rights and want the user to know they have taken that approach, because the underlying potential has advantages that can be more easily exploited. With increasing emphasis on AI and machine-learning as part of the scholarly publishing conversation, it makes sense to look under the hood to see what this means and how this impacts resource development and use. 

     Potential (discrete) components of a modern reference database. Not all may be present:

[Figure: layered components of a modern reference database]


     This illustrates the concept of layering discrete components ‘on top of each other’ to provide a service that may or may not include semantic technologies, depending on how many layers are implemented as part of the product design and build. 

This diagram is purposefully not the technical architecture, of which there are abundant illustrations elsewhere. 

 Working from the bottom up:
  • Traditional relational database content and taxonomies. At the base is the foundational relational database and the classification / controlled vocabularies. This, plus search and a user interface represents a traditional A&I service.
  • Linked data. The next layer up represents a new level of indexing with linked data (for simplicity, we’ve grouped RDF here to indicate a package).
  • Ontologies describe the meaning of relationships between terms, e.g. more verbs like “is a subspecies of” or “attends” or “is married to.” We’ve separated this from taxonomies due to how it may be employed. 
  • Search exists in traditional and semantic variations.  For instance, semantic search requires different search capabilities, e.g. NoSQL (Not Only SQL) and SPARQL.  If unfamiliar, a SPARQL search on Wikidata shows what this looks like.
  • Finally, the top two layers include two separate components that can appear to be one:
    • a user-friendly interface,
    • the ability to run analytics, visualizations, and reports, retrieving metadata from the linked data tables and relational database.
     This is presented as a filter for useful inquiry into understanding the product from a customer / user perspective in evaluating a resource.  More about the specific technical architectures is beyond the scope of this piece.
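
     As a rough illustration of the semantic search layer mentioned above, the Python sketch below assembles a SPARQL query for the public Wikidata query service and builds the request URL without sending it. The identifiers are assumptions to verify before use: P69 is taken to be Wikidata’s “educated at” property and Q49088 to be Columbia University, and build_query_url is a helper invented for this example.

```python
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

# Pattern: "?person educated-at Columbia University".
# P69 and Q49088 are assumed identifiers; check them on Wikidata before use.
QUERY = """
SELECT ?person ?personLabel WHERE {
  ?person wdt:P69 wd:Q49088 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""


def build_query_url(query, endpoint=WIKIDATA_ENDPOINT):
    """Return a GET URL for the query (hypothetical helper; no request is sent)."""
    params = urlencode({"query": query.strip(), "format": "json"})
    return endpoint + "?" + params


url = build_query_url(QUERY)
print(url)
```

     Note how the query is a graph pattern (subject, predicate, object with a variable) rather than a keyword match, which is why this layer needs different search capabilities from a traditional A&I search box.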

Inspec Analytics: Exposing New Value
     For A&I services, excellence in discovery with high precision and recall is a strength. Discovery is enhanced with the additional linked data. Discovery is important but it is only one capability.  With Inspec Analytics, the metadata itself has new life and value, offering ways to use just the metadata for institutional profiles, significant research into who is publishing what and where, or identifying other researchers for collaboration.  This can be done at levels of extreme and flexible granularity and multiple views. Users are asking questions of the metadata itself like:
  • What is the research output from my specific institution for a particular field?
  • How am I connected to the leading authors in a particular field?
  • In which journals have my peers published?
  • Who are we collaborating with, or could collaborate with?
This is exposing new relationships with the content and services:
  • There are new types of uses by different members of the user community.
  • Searches represent a high priority for faculty and administration.
  • Inspec is experiencing increased engagement with customers at a new level of value.
  • Calls from customers include additional ideas and requests that either flow into new features or have the potential to lead to new business opportunities.
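
     To make the idea of asking questions of the metadata itself more concrete, here is a small Python sketch that answers two of the questions listed above from a handful of records. Everything in it is made up for illustration: the records, the field names and the two helper functions are hypothetical and do not reflect Inspec’s actual data model or interface.

```python
from collections import Counter

# Entirely fictional metadata records, for illustration only.
RECORDS = [
    {"title": "Paper A", "journal": "Journal X", "field": "bioengineering",
     "authors": [("Author 1", "Institution P"), ("Author 2", "Institution Q")]},
    {"title": "Paper B", "journal": "Journal Y", "field": "bioengineering",
     "authors": [("Author 1", "Institution P"), ("Author 3", "Institution P")]},
    {"title": "Paper C", "journal": "Journal X", "field": "photonics",
     "authors": [("Author 4", "Institution Q"), ("Author 5", "Institution R")]},
]


def output_by_institution(records, field):
    """Count papers per institution in one field (each paper counted once)."""
    counts = Counter()
    for record in records:
        if record["field"] == field:
            counts.update({inst for _, inst in record["authors"]})
    return counts


def collaborators(records, institution):
    """Count co-authoring institutions on papers involving `institution`."""
    counts = Counter()
    for record in records:
        institutions = {inst for _, inst in record["authors"]}
        if institution in institutions:
            counts.update(institutions - {institution})
    return counts


print(output_by_institution(RECORDS, "bioengineering"))
print(collaborators(RECORDS, "Institution P"))
```

     Neither function touches the full text of a paper; both work from the metadata alone, which is the kind of reuse described in this section.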

Measuring Success
     Early measurements of success include statistics showing that users are spending longer in the database, visiting more frequently, and printing reports to share with colleagues. New types of users are enthusiastic and engaged. Librarians are also pleased that the additional services are drawing users to a quality resource. 

A Significant Transformation
     This is a strong strategic play for Inspec on multiple fronts. Despite the missional nature of the decision, they knew it would be a value-added play, not tied to additional revenue. Projects like this represent a significant commitment of time and resources.  Just considering the roughly 30-40 people involved (some shared with another project) reflects a considerable investment and new staff composition. For example, this includes 4 additional data scientists for statistical analysis, 3 developers designing and implementing linked tables, testing teams and external specialist consultants in addition to technology vendors. At a Rave Technologies talk in December, David Smith, Head of Product Solutions, joked that the first two years involved “trust” by senior management until the vision began to unfold.
     The project started in 2015, along with a decision to upgrade their entire platform and leverage that development. For the past year, 100 institutions have had access to a beta version of Inspec Analytics to provide feedback, soon to come out of beta with access by all Inspec customers as part of their subscriptions. 
  • Besides routine additions to the Inspec A&I database using existing taxonomies and ontologies, Inspec has added linked data back to 2013 (so researchers would have five years of data when they went live in 2018). They are using industry standards wherever possible, and sources like Ringgold for institutional identification. 
  • Inspec has worked with Molecular Connections to create the Analytics user interface and reports. 
  • Linked data is now an ongoing additional part of their metadata creation workflow.
    • Linked data is created separately from traditional indexing.
    • There are plans to add linked data retrospectively, from 2009 to the present.
  • The Graphical User Interface and Analytics and Reporting layers are available only via the Inspec website.  Other platforms may link out to these tools or pursue different integration strategies. 
  • New capabilities and reports continue to be added based on input and requests from user communities. 

Conclusion
     The new flexible data models and related outputs also offer serious competitive advantages besides greater direct engagement with users looking to mine the data, suggesting fertile ground for other benefits to follow.   As Tim Aitken aptly put it when I interviewed him, clearly feeling the excitement from their user community, “Inspec Analytics has unlocked the treasure trove that is Inspec.”  We look forward to watching their space!
Inspec Analytics
https://inspec-analytics.theiet.org/
Additional visuals and explanations of features are available in the Inspec Analytics User Guide.


Acknowledgements
With special thanks and appreciation to Tim Aitken, IET Inspec Analytics, and Bill Kasdorf, Kasdorf & Associates, for their time and much appreciated contributions!
 © 2019 M Adamson Associates. All Rights Reserved.




    Maureen Adamson

    Creativity in content, in services, in business models and marketing draws my attention when they align into new ways of engaging the user. This blog explores the innovations, people and trends that intrigue or inspire, offering insights into the future of publishing and scholarly communications. 

M Adamson Associates     Rye, NY    (914)925-2311     [email protected]