Page edited by Hélder Silva
Hélder Silva 2012-05-15T15:00:11Z
Comment added by Bjarne Andersen
I must agree with Paul. The Issue description should describe a "problem": why is it interesting to do this comparison at all? The description has a bit of this.
The title could, for example, be "IS28 Need for automatic change detection of web pages", based on the fact that web archives like Internet Memory currently inspect web pages manually, which by nature does not scale to millions of pages.
Bjarne Andersen 2012-05-14T13:06:03Z
Comment added by Paul Wheatley
Hi Dennis. Yes, I'm not suggesting that this is not useful work! But this page describes a solution, not a preservation issue. Re-phrasing it to describe what the actual problem is and what the requirements for the solution are would be useful. Without these details, designing the solution correctly and evaluating it become very difficult.
In reply to a comment by Denis Pitzalis.
Paul Wheatley 2012-05-14T13:01:43Z
Comment added by Denis Pitzalis
Hi Paul, maybe it is a question of rephrasing, but measuring the similarity of two web pages still remains an issue. The solution is the one in SO18, where the actual software to do that is introduced. People at UPMC are now working on the measure of success and benchmarking.
In reply to a comment by Paul Wheatley.
Denis Pitzalis 2012-05-14T12:51:29Z
Comment added by Paul Wheatley
This is not an Issue, it's a Solution! What is the problem, challenge or issue that is being experienced with a specific Dataset? I would suggest moving the text in this page to a relevant Solution page.
Paul Wheatley 2012-05-14T12:45:28Z
File attached by Roman Graf
XML File mapred-site.xml (0.1 kB)
- Hadoop configuration file
File attached by Roman Graf
XML File hdfs-site.xml (0.1 kB)
- Hadoop configuration file
File attached by Roman Graf
XML File core-site.xml (0.1 kB)
- Hadoop configuration file
Page edited by Bjarne Andersen
Bjarne Andersen 2012-05-08T06:59:01Z
Page edited by Sven Schlarb
Sven Schlarb 2012-05-07T19:40:06Z
Page edited by Sven Schlarb
Sven Schlarb 2012-05-07T16:38:53Z
File attached by Sven Schlarb
PNG File evaluation_50bookpairs_v2.png (48 kB)
File edited by Sven Schlarb
PNG File evaluation_50bookpairs.png (48 kB)
File attached by Sven Schlarb
PNG File Books_evaluation_chart.png (28 kB)
File attached by Sven Schlarb
File Books_evaluation_chart (28 kB)
Page edited by Nir Sherwinter
Nir Sherwinter 2012-05-07T14:42:38Z
Page edited by Bjarne Andersen
Bjarne Andersen 2012-05-07T12:48:51Z
Page edited by Bjarne Andersen
Bjarne Andersen 2012-05-07T12:48:39Z
Page edited by Asger Askov Blekinge
Asger Askov Blekinge 2012-05-07T12:47:50Z
Page edited by Simon Lambert - "Added content of 'Evaluation' section"
Simon Lambert 2012-05-06T19:38:46Z