OASIS Archive – Open Archiving System with Internet Sharing

Juergen Enge¹, Andrzej Głowacz², Michał Grega², Mikołaj Leszczuk², Zdzisław Papir², Piotr Romaniak² and Viliam Simko³

¹ University of Applied Sciences and Arts, Ausstellungsstrasse 60, 8031 Zurich, Switzerland
[email protected]

² AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Krakow, Poland
{glowacz, grega, leszczuk, papir, romaniak}@kt.agh.edu.pl

³ CIANT International Centre for Art and New Technologies, Imrychova 882, 14300 Praha 4, Czech Republic
[email protected]

Abstract. The OASIS Archive project aimed to develop a system for the universal, location-independent presentation of Media Art works. The goal was to establish a user-friendly search system in order to ensure the preservation and availability (sustainability) of cultural heritage in the field of Media Art. The metadata system, which interlinks the databases of all participating institutions, can be accessed by individual users (both researchers and the general public) through an on-line interface, by multimedia archive servers engaged in exchange within a distributed system, and by various play-out media. Access permission is scalable depending on whether the access takes place within the system or from the outside.

Keywords: storage/repositories, libraries/information repositories/publishing, feature measurement, intelligent web services and semantic web, arts

1 Introduction

A huge amount of Electronic Media Art is stored by cultural institutions, art schools and content providers. These pieces of art and cultural heritage encompass archived video tapes, digitized pictures, voice recordings and other content. The proper documentation of archived material plays a key role in the process of cultural heritage preservation. A significant share of artwork collections is hardly accessible to citizens due to the lack of a unified content description or of an infrastructure for storage and presentation.

The first and foremost motivation of the presented OASIS Archive project is making media art available to the general public. Various databases need to be connected through a distributed metadata system that ensures the availability of media objects of cultural value for access and usage independent of location.


The second motivation is the automation of content description. Detailed metadata must be entered manually for each new item in the archive, which requires a significant amount of work. In order to make this process as effective as possible, a set of tools that extract certain types of data directly from the content can be used. The system uses techniques of automated speech¹, text² and face recognition.

Currently there are many projects and supporting programmes addressing the three most crucial issues concerning all stakeholders of media art: digitization (preservation, archives, metadata), annotation (search parameters and thesauri) and user-friendly online access (dissemination, exposure, use, search and integration software). Most projects concentrate on digitization, annotation solutions and software development. Examples include: DSpace [1], FEDORA [2], CDS Invenio, INCCA [3], Inside Installations, and The European Library [4].

The above-mentioned solutions provide content-based searching mechanisms for media art and other cultural heritage repositories. Unfortunately, the number of artworks that can be searched is rather limited, as the systems are closed and aimed at specific repositories. Therefore, the authors' aim was to combine user-friendly, content-based search techniques with universal access to vast repositories.

The foremost benefit of OASIS Archive is the preservation of media art works for future generations (sustainability). Moreover, the content of the individual servers is copied and backed up through the communication of the servers with each other (replication technology). By this means, cultural heritage is not only documented and preserved, but also kept in a current state of availability.

Another key factor is the open, distributed infrastructure of the system: the data can remain at their physical location and be linked by a comprehensive meta level. OASIS Archive is thus both scalable and extendable (inclusion of new archives and databases, access for the handicapped, e.g. audio-rendering for the blind).

Scalability of the system is supported by indexing modules (ASR, OCR, Face Recognition) for various media content (video, pictures, documents, etc.). Metadata (keywords) can thus be obtained automatically, and these keywords can be searched. New feature-extraction modules can be easily integrated into the architecture in order to extend the system. The OASIS Archive search front-end is available online.

The Pan-European linking of institutions engaged in collecting and documenting Media Art will open up new possibilities for the exchange of cultural heritage. Moreover, the prospect of collaboration between archives and resources in different locations and on different data carriers in Central, Eastern and South Eastern Europe offers an incentive for other artists and collecting institutions to form an alliance with OASIS Archive and increase the dissemination of their own information content.

¹ Automatic Speech Recognition (ASR) allows one to recognize (to convert to a textual format) words or phrases spoken in any audio-visual content.

² Optical Character Recognition (OCR) is computer software which converts images into machine-editable text. The problem of digitizing data is quite old, but effective OCR is a subject of current research. Current OCR systems are able to cope even with poor quality printed documents and hand-printed filled forms.


The rest of the paper is organized as follows. Section 2 describes the components of the system. Section 3 presents the future of the OASIS Archive and concludes the paper.

2 Description of Components of the System

This section presents the overview of the system and its components – the Indexing Agent, Middleware with DB-Adapters and the GUI.

2.1 Overview of the system

The solution proposed by the OASIS Archive architecture is presented in Fig. 1. The content providers' repositories (International Centre for Art and New Technologies in Prague – CIANT, Centre for Art and Media in Karlsruhe – ZKM, and Netherlands Media Art Institute in Amsterdam – MONTEVIDEO) each consist of a database and a media streaming server. The database is a repository of text-based metadata such as installation name or artist information. The streaming server is the source of media. Because the partner databases differ in structure and the proposed architecture is not allowed to modify a provider's database, database adapters were created. Their role is to translate the contents of the provider's database into a format that the common middleware can process. The providers' media servers are connected to the Indexing Agent, whose role is to perform content-based indexing of the media (ASR, OCR, Face Recognition) and provide the results to the middleware. The middleware is a data source for the common Graphical User Interface (GUI front-end). It has to be emphasized that the presented system is highly extendable and new content databases are welcome.
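The role of a database adapter can be illustrated with a minimal sketch. All names below (CommonRecord, ProviderAdapter, the column names and the sample rows) are hypothetical and are not taken from the OASIS Archive implementation; the sketch only assumes that each provider exposes rows with its own column names and that the adapter reads them without modifying them.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical common metadata record handed to the middleware."""
    provider: str
    title: str
    artist: str
    media_url: str


class ProviderAdapter:
    """Read-only translator from one provider's row layout to CommonRecord."""

    def __init__(self, provider: str, field_map: Dict[str, str]):
        # field_map: common field name -> provider-specific column name
        self.provider = provider
        self.field_map = field_map

    def translate(self, rows: Iterable[Dict[str, str]]) -> List[CommonRecord]:
        # Rows are only read, never written back to the provider.
        return [
            CommonRecord(
                provider=self.provider,
                title=row[self.field_map["title"]],
                artist=row[self.field_map["artist"]],
                media_url=row[self.field_map["media_url"]],
            )
            for row in rows
        ]


# Hypothetical rows from two providers with different schemas.
provider_a_rows = [
    {"work_name": "Installation A", "author": "Artist X", "stream": "rtsp://a.example/1"},
]
provider_b_rows = [
    {"titel": "Installation B", "kuenstler": "Artist Y", "url": "rtsp://b.example/7"},
]

adapter_a = ProviderAdapter(
    "provider-a", {"title": "work_name", "artist": "author", "media_url": "stream"}
)
adapter_b = ProviderAdapter(
    "provider-b", {"title": "titel", "artist": "kuenstler", "media_url": "url"}
)

common = adapter_a.translate(provider_a_rows) + adapter_b.translate(provider_b_rows)
for record in common:
    print(record)
```

Keeping the mapping in a per-provider table means that adding a new content database only requires a new field map, which matches the extendability goal stated above.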

Fig. 1. OASIS Archive architecture.


2.2 Indexing Agent

The indexing agent implements three techniques of automated content-based indexing – Automated Speech Recognition (ASR), Optical Character Recognition (OCR) and Face Recognition.

The Microsoft Speech SDK has been used for speech recognition. Recognition is guided by a given dictionary and language model and runs on a dedicated indexing machine due to its high computational requirements.

The OCR method used in the OASIS Archive system is based on the recognition of text appearing in single frames of the analyzed video. The text is detected, pre-processed (variants of the Hough transform [6] may be applied in order to correct the rotation of characters or whole lines), analyzed and post-processed to clean up the recognition results.

Face Recognition for a multimedia content-based indexing system requires two steps: face detection in a given frame, followed by face recognition proper [7, 8]. Face detection methods likewise follow two characteristic approaches, based on skin detection and on eye detection. Prior to the recognition process, images of known persons should be added to the face database.
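How a Hough transform can reveal the rotation of a text line may be sketched as follows. This is an illustration of the general technique only, not the variant used in OASIS Archive: the function name, the parameter defaults and the synthetic baseline points are assumptions, and real pre-processing would first extract such points from the video frame.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def estimate_skew_degrees(
    points: Iterable[Tuple[float, float]],
    angle_range: int = 15,
    rho_step: float = 1.0,
) -> int:
    """Estimate the rotation of text lines with a coarse Hough vote.

    Each candidate skew angle defines a family of parallel lines in the
    normal form rho = x*cos(theta) + y*sin(theta), with theta = skew + 90
    degrees. The skew whose best rho bin collects the most points wins.
    """
    pts = list(points)
    if not pts:
        return 0
    best_votes, best_skew = -1, 0
    for skew in range(-angle_range, angle_range + 1):
        theta = math.radians(skew + 90)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        accumulator = Counter()
        for x, y in pts:
            rho = x * cos_t + y * sin_t
            accumulator[round(rho / rho_step)] += 1
        votes = max(accumulator.values())
        if votes > best_votes:
            best_votes, best_skew = votes, skew
    return best_skew


# Synthetic baseline of a text line rotated by 5 degrees (assumed input).
slope = math.tan(math.radians(5))
baseline = [(float(x), slope * x + 10.0) for x in range(200)]
print(estimate_skew_degrees(baseline))
```

Once the angle is known, the frame (or the detected text region) can be rotated by the opposite angle before character recognition.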

2.3 Middleware and DB-Adapters

Communication between the dissimilar components is facilitated by the middleware layer. The middleware serves as a tier between the front-end (web-interface or similar application) and individual DB-Adapters (see Fig. 1). Its main responsibilities are asynchronous communication among dissimilar components and classification of the results from DB-Adapters.

The latter in particular introduces several efficiency-related issues which had to be carefully resolved by the implementation. In order to understand these issues, one has to understand the copyright-related concerns and other requirements of end-users. Suffice it to say that, besides the search functionality, transparent sorting of the results throughout the whole network of remote databases is also required. The Archive metadata schema extends the Dublin Core [5] quite significantly, introducing metadata related to on-line transfer and indexing.

From the front-end's point of view, the middleware exposes an RPC-based³ interface. The middleware itself must implement a protocol for communication with the DB-Adapters.
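One way to obtain a single sorted result list from several remote databases is a k-way merge. The sketch below is not the OASIS Archive implementation; it assumes, hypothetically, that every DB-Adapter returns its own results already sorted by the requested field, and the adapter names and records are invented for illustration.

```python
import heapq
from typing import Dict, List


def merge_sorted_results(per_adapter: Dict[str, List[dict]], sort_field: str) -> List[dict]:
    """Merge result lists that each adapter has already sorted by sort_field."""
    return list(heapq.merge(*per_adapter.values(), key=lambda rec: rec[sort_field]))


# Hypothetical per-adapter results, each list sorted by title.
results = {
    "adapter-1": [{"title": "Alpha", "year": 1995}, {"title": "Delta", "year": 2001}],
    "adapter-2": [{"title": "Beta", "year": 1998}, {"title": "Gamma", "year": 1990}],
}

merged = merge_sorted_results(results, "title")
print([rec["title"] for rec in merged])
```

Under this assumption the middleware never has to re-sort the complete result set; it only compares the current head of each adapter's list.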

In an ideal world, a full-featured caching mechanism would have been implemented to ensure efficient searching and retrieval of the metadata. However, due to several copyright concerns, a so-called cache-per-request mechanism had to be implemented. In such a system, every single search request creates a new instance of cached metadata and erases the cached metadata from the previous request. This approach is a far cry from the ideal situation; however, the content providers' demands had to be respected.
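The cache-per-request behaviour described above can be sketched in a few lines. The class and method names (PerRequestCache, search, page) and the fetch callback are hypothetical; the sketch only demonstrates that each new search discards the metadata cached for the previous one, while follow-up operations such as paging are served from the current cache.

```python
from typing import Callable, Iterable, List


class PerRequestCache:
    """Holds the metadata of the most recent search request only."""

    def __init__(self, fetch: Callable[[str], Iterable[dict]]):
        self._fetch = fetch          # hypothetical call into the DB-Adapters
        self._records: List[dict] = []

    def search(self, query: str) -> int:
        # Erase the metadata cached for the previous request first.
        self._records = []
        # Then cache the metadata returned for the new request.
        self._records = list(self._fetch(query))
        return len(self._records)

    def page(self, offset: int, limit: int) -> List[dict]:
        # Paging is served from the cache of the current request.
        return self._records[offset:offset + limit]


def demo_fetch(query: str) -> List[dict]:
    # Stand-in for remote adapters; returns three fake records per query.
    return [{"query": query, "n": i} for i in range(3)]


cache = PerRequestCache(demo_fetch)
cache.search("video")
first_page = cache.page(0, 2)
cache.search("audio")            # the "video" metadata is gone now
print(first_page, cache.page(0, 1))
```

The cost of this design is that repeating an earlier query always triggers a new round trip to the adapters, which is the inefficiency the paragraph above refers to.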

³ RPC – Remote Procedure Call


2.4 Graphical User Interfaces (Front-Ends) and Load Balancing

In the architecture, any application which communicates with the middleware acts as a front-end. As part of the OASIS Archive bundle, a web-based front-end is provided (located at: http://search.oasis-archive.eu). However, the system can easily be extended with other front-ends, placed on separate hardware in order to allow load balancing. Multiple instances on a single server can support several graphical designs, as depicted in Fig. 2.

Fig. 2. Multiple designs implemented in the front-end.

Scalability of the system is achieved by introducing several load-balancing scenarios (Fig. 3).

Fig. 3. Examples of load-balancing scenarios in the architecture.

Each of the components might (but does not have to) be deployed separately, depending on the anticipated workload. One extreme is to place all the components on a single server; the other is to separate them completely or even to deploy multiple instances of them.


3 Conclusions and Further Work

This paper has presented the OASIS Archive project. The project has designed and developed an open, distributed Internet platform for research, preservation and documentation of electronic arts. The architecture can be re-used for other types of media as well.

The authors found the general OASIS Archive architecture, as well as the implementations of particular modules, to function well but sub-optimally. Therefore, several modifications or extensions could still be proposed.

The first modification aims at the maximal simplification of the DB-Adapter installation procedure. The second scenario considers both simplified installation and additional centralized caching of metadata (in the middleware) based on the Resource Description Framework (RDF).

Effort is also planned in the area of content-based multimedia indexing. The proposed tasks mainly include the implementation of new, improved indexing engines. These tasks will allow for more accurate indexing and search and will improve the performance of the whole indexing service.

Acknowledgements. The authors would like to acknowledge the EC Culture 2000 programme for funding the OASIS Archive project as well as the EC eContentplus programme for funding the GAMA project.

References

1. Tansley, R., Smith, M.K., Harford-Walker, J.: The DSpace Open Source Digital Asset Management System: Challenges and Opportunities. Lecture Notes in Computer Science, vol. 3652, 242–253. Springer, Heidelberg (2005)

2. Payette, S., Lagoze, C.: FEDORA – Flexible and Extensible Digital Object and Repository Architecture (1998)

3. Stoebe, H., Hierl, C.: Das EU-Pilotprojekt INCCA. International Network for the Conservation of Contemporary Art. Restaurierung und Zeitgeist, vol. 18, 61–64, Wien (2002)

4. van Der Meulen, E.: The European Library – History, Technique and User Expectations. Interlending & Document Supply 35(3), 154–156, Emerald Group, UK (2007)

5. Dublin Core Metadata Element Set, version 1.1, Dublin Core Metadata Initiative, http://dublincore.org/documents/dces (2006)

6. Hough, P.V.C.: Machine Analysis of Bubble Chamber Pictures. In: International Conference on High Energy Accelerators and Instrumentation, CERN, Switzerland/France (1959)

7. Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, 711–720, IEEE, New York (1997)

8. Bolme, D.S., Beveridge, J.R., Teixeira, M., Draper, B.A.: The CSU Face Identification Evaluation System: Its Purpose, Features and Structure. In: International Conference on Vision Systems (ICVS), LNCS, vol. 2626, 304–311, Springer, Heidelberg (2003)

