Time Tracking and Editing Distance Reporting
05 March 2019 by Michał Tosza

What does your client want to know? The time you spent editing, the phrases, sentences, and segments you changed to make the translation good, and the quality of their MT engine, among others. Let’s see how easily you can create reports in memoQ.

Recently one of my game developer friends, Trent Gamblin, asked me to translate some string updates for his game Monster RPG 2. Apart from the new text, he also had some old translations that he wanted reviewed. Unfortunately, these translations were not of the highest quality. Check it out in the screenshot below. This is just an automated QA, but you can notice untranslated segments, inconsistencies, and spelling errors. However, the translations were still acceptable and needed only corrections, not retranslating from scratch.

He asked me how I could show him the scope of the work I performed on the strings. As I knew I would use memoQ for this job, I replied that apart from the corrected texts, I could deliver either:
a) a user-friendly time tracking report, or
b) a clear editing distance report that shows how many words and phrases I changed and to what extent.

The two reports serve different purposes and suit different circumstances and clients. In general, though, they let the reviewer prove that the editing was done, and show both the time it took and the scope of the changes. Time tracking is easier to understand for non-technical clients. They just get an Excel sheet that says “reviewing this translation lasted 6 hours, 35 minutes and 35 seconds”. Editing distance, on the other hand, says “see how many words and phrases I needed to change completely or partially to make this translation good”. It may seem a bit difficult to interpret, but I will get back to this in the second part of this post.
In the end, Trent asked me to wait with correcting the strings, but he was kind enough to let me use them in this blog post. So, on the basis of the issues I found, I will describe two features of memoQ that let you measure the time spent and the work done when reviewing and correcting somebody else’s translations. What I like the most about the Editing time and Edit distance statistics features is that they work equally well whether you review human translations or texts translated by a machine translation engine. The reports you get are a great basis for showing your clients how much work you did.

Editing Time

Let’s start with Editing time, as this is the more straightforward option.

1. First, turn on the Editing time feature in Options -> Miscellaneous -> Editing time. Tick the Record editing time when I am working checkbox. You may also set Discard intervals longer than. I mainly use 5-minute intervals. This setting stops the clock when you are not working, e.g. when you leave memoQ open, go for a cup of coffee, and come back an hour later to continue. That hour should not be tracked as your work. Be advised that you may sometimes encounter very long strings, and reading or correcting them may take several minutes, so be careful about setting very low values here (such as 1 or 2 minutes). Click OK.

2. Now switch to the Overview section and click the Reports tab. The first section is Options (3), where you choose whether to track the number of words you edit or the number of characters. For me, Words is more transparent.

And that’s it. Open your editor and start reviewing. After each confirmed segment, you will notice that memoQ displays the time you “spent in the segment”.
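The effect of the Discard intervals longer than setting from step 1 can be pictured with a small sketch. This is only an illustration of the idea, not memoQ's actual implementation: the function name `tracked_editing_time`, the notion of "activity timestamps", and the sample data are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def tracked_editing_time(activity_times, discard_longer_than=timedelta(minutes=5)):
    """Sum the gaps between consecutive activity timestamps.

    Hypothetical model: any gap longer than the threshold is treated as a
    break (e.g. a coffee away from the keyboard) and is not counted.
    """
    total = timedelta()
    ordered = sorted(activity_times)
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if gap <= discard_longer_than:
            total += gap
    return total


# Made-up sample data: three short working gaps and one hour-long break.
start = datetime(2019, 3, 5, 9, 0, 0)
events = [
    start,
    start + timedelta(minutes=2),   # 2-minute gap, counted
    start + timedelta(minutes=6),   # 4-minute gap, counted
    start + timedelta(minutes=66),  # 60-minute gap, discarded
    start + timedelta(minutes=69),  # 3-minute gap, counted
]

print(tracked_editing_time(events))  # 0:09:00
```

With a 5-minute threshold, the hour-long break is dropped and only 9 minutes are counted. With a 1-minute threshold, every gap in this sample would be discarded, which is why very low values are risky for long strings.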
Note that in order to track the time, you need to work either in the role of “reviewer” in memoQ or, if your role is “translator”, with unconfirmed segments that you always confirm, whether or not you introduced any changes. Reading a translation that requires no changes is work too: you devote your time to confirming that the translation is of good quality.

When all the required edits are done, you need to prepare a report.

1. Head back to Project home -> Overview section and click the Reports tab.

2. At the bottom, below the Editing time section, click Create new report now.

3. The Create editing time report dialog is displayed. Here you need to configure some options.
(1) Note that you can Include spaces in character counts. Spaces are a natural part of almost every string, and correcting issues such as double spaces or spaces before punctuation marks is vital.
(2) The TRADOS 2007-like radio button is there just to maintain consistency with the archaic Trados 2007, software that had its own philosophy for word counts. You will probably never use it unless your client still uses this legacy program. So here, leave memoQ selected.
(3) Correcting tags can be very time-consuming, so they should be counted as words you checked. You need to configure “how many words is one tag”. I learned the hard way that tags can be tricky creatures, and I hate correcting messed-up tags. So if I have to do this, it must be profitable for me, hence I treat one tag as one word.
When finished, click OK.

4. The report is ready. Click the Show/Hide (1) button to display it. You see a table divided by match types, with the two most important items:
Edit time. This is the basic measure that matters to you and your client, as this is most probably the time you will bill on your invoice.
Words per hour. This measure is a great piece of information for you, as it shows how fast you work.
It depends on the type of text, the quality of the translation you receive, and some other factors, but it still gives you a general idea of the speed of your work. It is very useful for clients that give you a lot of lengthy review jobs within the same project, because it tells you how much time you need to complete yet another 65,000-word review.

5. Having browsed all the numbers, click Export and save the report in CSV format.

As I mentioned before, this is a versatile tool that lets you track your time whether you accepted an increasingly popular machine translation post-editing job or a text translated by a fellow translator. However, if you do a lot of PEMT (post-editing machine translation) or PMTE (post machine translation editing), and your client wants a PEMT report, they will most likely ask you to deliver an Editing distance report (or both the time and distance reports). This is because the Editing distance report gives them valuable information on the initial quality of the machine translation. If your Editing distance report shows that you introduced a lot of changes (a large Levenshtein distance, see below), the quality of the MT was rather poor and the MT engineers will have more work to do when tweaking the engine. If you did not change much, that may be an indicator that the MT is quite good.

Editing distance statistics report

Let’s assume that your client wants you to deliver an Editing distance statistics report for a PEMT file. They want to know how many segments you edited in general, how many characters you changed in the whole job, and what the normalized edit distance is. In simple terms, your client wants to know the time you spent editing, the phrases, sentences, and segments you changed to make the translation good, and the quality of their MT engine, among others.

How do you generate such a report?

1. Import the translated, bilingual document just as usual.

2. In this case, all the segments are imported as edited in memoQ and you can start reviewing.

3. When finished, go to Project home -> Translations -> right click History/Reports.

4. In the lower pane, “Minor versions of the document for major version 1”, hold Ctrl and click version 1.0 (the original translation) and current (the translation with your changes), then compare them by clicking Calculate edit distance.

5. In the “Edit distance statistics” dialog, make sure that memoQ is selected in the Word counts section and Levenshtein in the Distance measurements section. Click Calculate. The report pane at the bottom will display the information you need:
Edited segments: the number of segments where any changes were introduced. Even if you add or delete a single space, the segment is counted here.
Absolute edit distance: the number of characters that were changed. A change here means any alteration: adding, deleting, or substituting.
Normalized editing distance: a number ranging from 0 to 1. Zero means that none of the text (0%) was changed, and 1 means that the whole text (100% of it) was changed. The more changes you make, the closer to 1 this number will be.

You may wonder what the “Levenshtein” option is. It refers to a measure called “Levenshtein distance”, which tells you how much (in characters) one string differs from another. I will use the example from Wikipedia, as it is quite logical. The distance between the words (strings) "kitten" and "sitting" is 3, since the following three edits change one into the other, and there is no way to do it with fewer than three edits:
kitten → sitten (substitution of "s" for "k")
sitten → sittin (substitution of "i" for "e")
sittin → sitting (insertion of "g" at the end)
Three changes equal a Levenshtein distance of 3.

6. Click Export and save the report in HTML format, which is, by the way, a lot clearer than CSV.
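For readers who want to see the numbers behind the report, the Levenshtein measure described above can be sketched in a few lines of Python. This is the standard dynamic-programming algorithm, not memoQ's code. The function names are made up for this example, and the normalization (dividing by the length of the longer string) is an assumption: it is one common convention, and memoQ's exact formula may differ.

```python
def levenshtein(source, target):
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn source into target."""
    # previous[j] holds the distance between the processed prefix of
    # source and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def normalized_distance(source, target):
    """Scale the distance to the 0..1 range.

    Assumption: divide by the length of the longer string.
    """
    longest = max(len(source), len(target))
    if longest == 0:
        return 0.0
    return levenshtein(source, target) / longest


print(levenshtein("kitten", "sitting"))  # 3
print(round(normalized_distance("kitten", "sitting"), 2))  # 0.43
```

Under this normalization, identical strings score 0 and strings with no characters in common at any position score 1, which matches the 0 to 1 range shown in the report.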
Now you can deliver this file to your client. If they asked for it, they will know what to do with it and how to interpret the numbers.

The features we discussed today are becoming more and more popular. As memoQ wrote in their Trend report 2019: “Prices Go Down, but Job Opportunities Go up”. To keep up with this, translators must learn to use the new features offered by translation software. memoQ is quite well equipped and prepared for today's translation market, and the two reporting features we discussed today are, and will remain, very useful to know. They allow you to adapt to new requirements such as PEMT jobs and to fulfill client needs.