{"@context":"http://iiif.io/api/presentation/3/context.json","id":"https://fiatiasa2020.aviaryplatform.com/iiif/hd7np1x329/manifest","type":"Manifest","label":{"en":["Artificial Intelligence for a role change in television archives: the ATRESMEDIA experience"]},"logo":"https://d9jk7wjtjpu5g.cloudfront.net/organizations/logo_images/000/000/123/original/IASA-FIAT-Logo.png?1603820464","metadata":[{"label":{"en":["Venue"]},"value":{"en":["a. Stage"]}},{"label":{"en":["Date"]},"value":{"en":["2020-10-27"]}},{"label":{"en":["Type"]},"value":{"en":["Paper"]}},{"label":{"en":["Agent"]},"value":{"en":["Eugenio Lopez de Quintana (Speaker)","Antonio León Carpio (Speaker)","Virginia Bazán-Gil (Moderator)"]}},{"label":{"en":["Description"]},"value":{"en":["\u003cp\u003eNew technologies based on Artificial Intelligence algorithms have been implemented in ATRESMEDIA Archive, Spain, since 2019. Focusing primarily on media assets with a strong predominance of speech content, the project covers the identification and segmentation of speakers for video content, as well as the automatic cataloguing of still images. These new functionalities have been carefully integrated in the ATRESMEDIA MAM workflows and embedded in cataloguing tools and processes. The user's search interface has also been adapted to incorporate this new source of automatically generated metadata. Optional search modes and visual distinction of metadata according to its origin are both elements of this integration, as well as the new ontology which is surfable through graphs, trees and nodes. The first objective to be achieved with this new approach to media management in ATRESMEDIA will be the significant reduction of the highly time-consuming work of segmentation and description of this type of media files, as it has been conceived as a real and tangible aid in daily operations. 
By leaning on technological innovations, there is a very ambitious objective underlying this project: the challenge of transforming the traditional role of television archivists, gradually putting aside such tasks as selection, cataloguing and searching in favor of another professional profile more committed to the content generation and management itself.\u003c/p\u003e"]}},{"label":{"en":["Temp"]},"value":{"en":["FIAT_IASA1117"]}},{"label":{"en":["Identifier"]},"value":{"en":["209"]}}],"summary":{"en":["\u003cp\u003eNew technologies based on Artificial Intelligence algorithms have been implemented in ATRESMEDIA Archive, Spain, since 2019. Focusing primarily on media assets with a strong predominance of speech content, the project covers the identification and segmentation of speakers for video content, as well as the automatic cataloguing of still images. These new functionalities have been carefully integrated in the ATRESMEDIA MAM workflows and embedded in cataloguing tools and processes. The user's search interface has also been adapted to incorporate this new source of automatically generated metadata. Optional search modes and visual distinction of metadata according to its origin are both elements of this integration, as well as the new ontology which is surfable through graphs, trees and nodes. The first objective to be achieved with this new approach to media management in ATRESMEDIA will be the significant reduction of the highly time-consuming work of segmentation and description of this type of media files, as it has been conceived as a real and tangible aid in daily operations. 
By leaning on technological innovations, there is a very ambitious objective underlying this project: the challenge of transforming the traditional role of television archivists, gradually putting aside such tasks as selection, cataloguing and searching in favor of another professional profile more committed to the content generation and management itself.\u003c/p\u003e"]},"provider":[{"id":"https://fiatiasa2020.aviaryplatform.com/aboutus","type":"Agent","label":{"en":["FIAT IASA 2020"]},"homepage":[{"id":"https://fiatiasa2020.aviaryplatform.com/","type":"Text","label":{"en":["FIAT IASA 2020"]},"format":"text/html"}],"logo":[{"id":"https://d9jk7wjtjpu5g.cloudfront.net/organizations/logo_images/000/000/123/original/IASA-FIAT-Logo.png?1603820464","type":"Image"}]}],"thumbnail":[{"id":"https://d9jk7wjtjpu5g.cloudfront.net/collection_resource_files/thumbnails/000/100/864/small/open-uri20201105-742-1wo4iqm_1604616420.jpg?1604598434","type":"Image","format":"image/jpeg"}],"items":[{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864","type":"Canvas","label":{"en":["Media File 1 of 1 - 
open-uri20201105-742-1wo4iqm.mp4"]},"duration":1640.933,"width":640,"height":360,"thumbnail":[{"id":"https://d9jk7wjtjpu5g.cloudfront.net/collection_resource_files/thumbnails/000/100/864/small/open-uri20201105-742-1wo4iqm_1604616420.jpg?1604598434","type":"Image","format":"image/jpeg"}],"items":[{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/content/1","type":"AnnotationPage","items":[{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/content/1/annotation/1","type":"Annotation","motivation":"painting","body":{"id":"https://aviary-p-fiatiasa2020.s3.wasabisys.com/collection_resource_files/resource_files/000/100/864/original/open-uri20201105-742-1wo4iqm.mp4?1604598418","type":"Video","format":"video/mp4","duration":1640.933,"width":640,"height":360},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864","metadata":[]}]}],"annotations":[{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288","type":"AnnotationPage","label":{"en":["English [Transcript]"]},"items":[{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/1","type":"Annotation","motivation":"transcribing","body":{"type":"TextualBody","value":"Welcome back to the 2020 Joint ISAF editorial conference. You are now joining us in this session, which will run from the. Well, which will run from 10 to 11 GMT, this session is focused on artificial intelligence, this new object of desire for archives. And we have said today two great examples about how integrate a in action archive. So please stay with us until the end of this session. We are having a question and answer session after this presentation. So you are kindly invited to register your comments. 
In the TED box on the screen, wireless is running and I hope you are all hearing me and well, let's start right now with artificial intelligence for our role change in television archives. The address maybe I experience by Yohanna, Lopevi, Quintana and Antonio. Leon Urania is the head of the archives at Rosemeadow, Spain. Antonio Leonys, founder and CEO. Maria Eugenia Antonio, all yours. Hello, good morning. Good evening. Good afternoon. It depends on the place you are watching this presentation match very well with the one of the claims of the conference that you can see the next slide. A transformation, in fact, the the reason, because we are using artificial intelligence and automatic processing, is to try to save time to to be able to do other things so we can do is change the profile of activist in our company situation that you can see on the screen, I think is very common for any of our companies. Materials are increasing. The granularity in the searching is increasing as well. In our case, we have a lot of queries that go very, very deep in the language, especially in the politicians. And unfortunately, we are seeing people or even less people on board on the archives.","format":"text/plain"},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864#t=482.91,636.509"},{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/2","type":"Annotation","motivation":"transcribing","body":{"type":"TextualBody","value":"So in this situation. What we try to do is to change this, this a scenario in which our documentary is to spend more than 70, 75 percent of the time doing this time, this kind of task, recreation and searching. But we think that the profile could be more on the on the on the right part of the screen or more related to knowledge management and content production. 
Because of that, we need to this idea of using automatic processing to get time to do this kind of of tasks. So let's see the concrete projects we are working with in the next slide. The possible yeah, next time, next one is OK, we say concrete because we want to emphasize that the idea is to work with a specific and concrete projects, that the two first projects we have launched are and based on transcription and segmentation. OK. OK, so a complete materials for materials or one part of our broadcast, this program is one of the goals of this project. The third is the first experience of cataloging a program without human intervention at all. And the fourth is something that will be launched at the end of this year. This is using facial recognition for cataloging stills and in the near future, maybe in the first term of the next year or so, using it for cataloging video. Antonio, can you talk a bit about algorithms that you are using? Yes, we are talking. Hello, I'm Antonio Will, I'm just solving the most important algorithms that we are going that we are in the process we will go through in the next slide. So we are working with automatic speech recognition that we all know. Also automatic subtitle sync, which means we work with the subtitles broadcasted by media in order to to have the transcription of a program of our broadcast that program.","format":"text/plain"},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864#t=636.79,787.69"},{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/3","type":"Annotation","motivation":"transcribing","body":{"type":"TextualBody","value":"We also work with the speaker segmentation. So we have an audio and we have to find different tiers between the different speakers. Then we have face detection and recognition, the segmentation. We will we will talk about about this one a little bit farther. 
And then we have voice activity detection, the signal to noise ratio in voice segments. Again, we will talk about this specific algorithm a little bit later. So now we go. So these are the algorithms and now we explain the scenarios and the different workflows that we are working with. OK, let's go with the first the first two projects that are based on raw material on one or more parts of our program, in this case we are being very, very concrete by only processing videos where the speakers are not talking at the same time so they can have to talk in a constructive way. So one after another one is very common in political domain. And regarding the issues that are the subject topics, we are mainly working with political information that is very important also in terms of congression, because vocabulary's is very is a key issue in transcription. So in this case, the workflow is like you can see here, documentaries can send from orbit or from our own documentation system materials to the MAM. There are some control quality process automatically there. But in terms of quality of video or even basic metadata that come from orbit approx, you five is created and send it to media devices. We get from them on file with a lot of metadata that we can import automatically in our system. And then in the in the other part, from the right of the of the screen, there is the human intervention.","format":"text/plain"},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864#t=788.17,911.899"},{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/4","type":"Annotation","motivation":"transcribing","body":{"type":"TextualBody","value":"In this case, documentarist can act over the automatic information and proceed to correct or whatever in the next slide. 
We can see what we get from the media devices that use the segmentation in this case automatically, Don, and also transmission that correspond to this sequence created in the next one. We can see how we import that in our system. On the left part is the information automatically processed. In the right part is the screen where the documentarist can work, doing in the same in the next slide that they can import part of the text or the text or in the next slide they can. And when they have important information, they can act on that in doing some functions like correcting the text or combining the automatic sequences created, adding more than one to another one sequence. So this is the possibility of human intervention. And in the next slide, we can see the ocean of import everything. So all the automatic sequence is generated next. You said something very interesting to to to point is that if the documentaries to make an intervention on the text, the text change color, and then the users can know when the information is was automatically created or was created by a documentary. This is important because as you as someone you will explain later. We are not working. That affects your world. So we have to deal with the imperfections. And that obviously is very important to be clear for the users that some information that can either be restrained or not created by a human, but by a machine. OK, let's go with this one. In this case, we are very proud of all of you for the first program, the first program catalog automatically and without human intervention.","format":"text/plain"},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864#t=911.9,1046.13"},{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/5","type":"Annotation","motivation":"transcribing","body":{"type":"TextualBody","value":"The workflow is very similar. 
But as you can see on the low part of the screen that we use the subtitles, they still subtitles file is sent by the MAM and along with the with the proxy fight with the media who may have a synchronization and the same work is done. So in the next screen, we can see how I'm automatically programed catalog is seen. And the next one know that here in the other part is the misery that created and imported by the MAM through different parts of the organization, in the lower part is the automatic information created by media. Let's go with the next one. This is the numbers so far. We have processed around 4000 hours with this system and then I think we will end the year with 5000 hours. Let's go with the next project we can think of, OK? So just to make you just seen this, this is very important. We want to stress that this is not a demolition of the pilot. This is a project that is already in production, which is working, which have 4000 hours and more and more coming up. So this is very important. But in order to really achieve this in technical point of view, we have to make a lot of adaptations of the system to the reality of the is the right or the content. So it's very important for us to understand that A.I. systems are not, you know, some sankalp or you can just put the content, any kind of content and you have all the good results just so we have to make a lot of devastation through the system. And here we are responding to some of some of the most important. So firstly, we analyze what we saw that we needed in this big issue.","format":"text/plain"},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864#t=1046.31,1165.28"},{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/6","type":"Annotation","motivation":"transcribing","body":{"type":"TextualBody","value":"We need more precision. 
So we needed at least one second position of India to have good, good differentiation of the text of the different interpretations. And also I decided that intervention to shorten that three seconds wasn't really useful. So we did an intervention under three seconds. Then we we wanted to work with semantics. Some of the text at the beginning of the preamble, which we wanted to do, is to have the whole text of intervention. And then just with semantics, this clustering differentiate the different topics of this intervention. We saw that this was really, really difficult because in reality, politics talks and go back and forth on different topics to another topic and go back, etc. So it's very difficult to have really semantic clustering, which is at least possible in another type of text for this transcription where difficult. So that Carvey's team decided that they wanted to do is to have segments always under two minutes. So we will work with the whole intervention and then we cut them in segments of two minutes. But the system will have to make these cuts over three whole sentences. So the thing is to have a whole paragraph, whole sentences, to have a common sense is not a different topic. But unless you have whole sentences of these segments of two minutes, at least up to two minutes have the same meaning, more or less. So this is very important. 
Again, in order to have good segmentation, we need a good automatic penetration of the text, which is something that not always is there is minded, but is very important to have good competition because it is very important to have these countries in order to understand the text, because if you have just a text with a population is very difficult to read and understand and understand, the penetration is very important.","format":"text/plain"},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864#t=1165.32,1276.369"},{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/7","type":"Annotation","motivation":"transcribing","body":{"type":"TextualBody","value":"And also, of course, you have this segmentation of Campese entities. You have to have good punctuation to inattentional into then segment this text. And finally, we have to analyze the quality of the of the audio because we for example, have decided that if we have less than 30 percent of voice activity in our in and out, this is not useful to automaticity. So you can have your nine hour of one hour and you have five minutes of Ollivier of voice and the rest is music and noise, etc. So that is not useful and then are more important in this voice activity segment. You have to have a good signal to noise ratio in order to have good automatic results. If you don't have the signal to noise ratio because the content is difficult, you know, it is the acoustic is bad, etc., you have to decide. It is hard to not analyze this content. You have to find the signal to noise ratio and then discard the content for the automatic processing and just go to the manual one. So these things are very important not to have all this noise in the meta data. 
So, OK, what is the next line of projects is to working with face recognition and in Georgia only for Catalonia deals in the next year or so using it for video. In this case, the war flow is quite different. Antonio. Yeah, in this case, we we generate like a repository of faces, these faces have the basic metadata, they come for us and then we are at the victor, the biometric vector to which one of them. So our database will have the convention on metadata and the biometric metadata as well. And here we have the workflow in the generating department with core parts ramshaw or vector biometric vectors.","format":"text/plain"},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864#t=1276.37,1405.469"},{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/8","type":"Annotation","motivation":"transcribing","body":{"type":"TextualBody","value":"When when we have a new picture entering into the catalog, into the into the repository of faces, we send it first to the ontology and we add this entity on the ontology and domain, and then we send the picture to the media, which generates the pattern and send it to us. And we are dispatched to the repository. As you can see now in the cataloging process, as we have already, the pattern is generated. We add different pictures with the same person that you can see here. The person is alone or is with another one, which is a group, and then we send it to the media again and it may generate an new set of patterns with this set of patterns we receive, in addition, fight and fight. And here's the key issue of this project we match. The pattern was generated in this case with the pattern that we have to store it in our repository. So when the match is done, we can import the metadata of this person to the new features and then we add that to the ontology also automatically for searching. 
We have the three conventional way of searching that is browsing the the repository of faces. We can type of course, the name of the person in this case is a well-known chef in Spain, or we can navigate through the ontology. This is a very conventional. But what is new is we can use any picture, any picture got from any part or sample from the website or whatever. And this is sent by the user without knowing, of course. So what we've done this is sent to the media at the media generates the pattern and matches with the pattern that we have. It's the same, the same for following the in this case, them just to emphasize on my side, on the side of of in this case of client asking for solutions, we want it from the very beginning, work with concrete user cases.","format":"text/plain"},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864#t=1405.78,1548.039"},{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/9","type":"Annotation","motivation":"transcribing","body":{"type":"TextualBody","value":"We didn't want to work with the whole archive, with the formative years, with the whole project, that these were something that were offered by many companies when we were searching for a solution. So we prefer to go step by step and achieve something, consolidate and go to the next step in this case, one or two speakers or more speakers consecutively. And what we have explained before, the second idea is that we wanted this solution integrated in our workflow that allowed us to choose if they want to use automatic or not devices, we want to also an existing product that is very important because many companies arrive to arrive with solutions that have to be developed. And this is this is something that we didn't want to do. And so we find the media here with a very real problem that we can see we can install in a reasonable time period of time. 
We know that we need adaptation, constantly, adaptation. That is something very complicated in the in the workflow with with the whole activity of the television working, which always is something complicated. And also we know that the coexistence between manual and automatic is also a challenge because something can be done automatically something or you have to combine and sometimes in the same type of material, you have materials that can and other.","format":"text/plain"},"target":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864#t=1548.04,1638.999"}]},{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288","type":"AnnotationPage","label":{"en":["English [Transcript]"]},"items":[{"id":"https://fiatiasa2020.aviaryplatform.com/collections/1193/collection_resources/32131/file/100864/transcript/22288/annotation/10","type":"Annotation","motivation":"subtitling","body":{"type":"TextualBody","value":"https://d9jk7wjtjpu5g.cloudfront.net/file_transcripts/associated_files/000/022/288/original/1612826540_open_uri20201105_742_1wo4iqm.vtt?1616984867","format":"text/vtt","language":"en"},"target":"https://d9jk7wjtjpu5g.cloudfront.net/file_transcripts/associated_files/000/022/288/original/1612826540_open_uri20201105_742_1wo4iqm.vtt?1616984867"}]}]}]}