Problem description
Running Nutch in Eclipse on Windows fails with the error below. I'm a complete beginner and this is urgent, so any help would be appreciated.

InjectorJob: starting at 2014-06-26 15:45:36
InjectorJob: Injecting urlDir: urls
InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.
****file:/D:/workspace/nutchTest/urls
InjectorJob: java.lang.RuntimeException: job failed: name=inject urls, jobid=job_local_0001
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:233)
    at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:251)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:273)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:282)
Solutions
Solution 2:
Check the hadoop.log file in the logs directory.
Solution 3:
Here is the log content:

2014-06-30 10:07:25,234 INFO crawl.InjectorJob - InjectorJob: Using class org.apache.gora.sql.store.SqlStore as the Gora storage class.
2014-06-30 10:07:25,875 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-06-30 10:07:26,046 WARN mapred.JobClient - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
2014-06-30 10:07:26,203 WARN snappy.LoadSnappy - Snappy native library not loaded
2014-06-30 10:07:27,078 INFO mapreduce.GoraRecordWriter - gora.buffer.write.limit = 10000
2014-06-30 10:07:27,375 WARN plugin.PluginRepository - java.io.FileNotFoundException: D:workspacenutchTest.srcplugin.svnplugin.xml (The system cannot find the file specified.)
2014-06-30 10:07:29,968 WARN regex.RegexURLNormalizer - Can't load the default rules!
2014-06-30 10:07:30,343 INFO regex.RegexURLNormalizer - can't find rules for scope 'inject', using default
2014-06-30 10:07:32,953 WARN mapred.FileOutputCommitter - Output path is null in cleanup
2014-06-30 10:07:33,703 INFO crawl.InjectorJob - InjectorJob: total number of urls rejected by filters: 0
2014-06-30 10:07:33,703 INFO crawl.InjectorJob - InjectorJob: total number of urls injected after normalization and filtering: 2
2014-06-30 10:07:33,875 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2014-06-30 10:07:33,875 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2014-06-30 10:07:33,875 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2014-06-30 10:07:34,281 WARN mapred.JobClient - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
2014-06-30 10:07:34,703 INFO mapreduce.GoraRecordReader - gora.buffer.read.limit = 10000
2014-06-30 10:07:35,281 WARN regex.RegexURLNormalizer - Can't load the default rules!
2014-06-30 10:07:35,281 WARN regex.RegexURLNormalizer - Can't load the default rules!
2014-06-30 10:07:35,281 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2014-06-30 10:07:35,281 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2014-06-30 10:07:35,281 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2014-06-30 10:07:35,359 INFO regex.RegexURLNormalizer - can't find rules for scope 'generate_host_count', using default
2014-06-30 10:07:37,703 INFO mapreduce.GoraRecordWriter - gora.buffer.write.limit = 10000
2014-06-30 10:07:37,750 WARN mapred.FileOutputCommitter - Output path is null in cleanup
2014-06-30 10:07:37,750 WARN mapred.LocalJobRunner - job_local_0002
java.lang.NullPointerException
    at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
    at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
Solution 4:
Has the original poster solved this? I ran into the same problem.
Solution 5:
The cause is that the plugin directory was not found. In nutch-site.xml, change
<property><name>plugin.folders</name><value>src/plugin</value></property>
to:
<property><name>plugin.folders</name><value>plugins</value></property>
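Solution 6:
Here's a link: http://my.oschina.net/DLow/blog/294951. The previous answer is not quite right: the first value (src/plugin) is for running inside the IDE, and the second (plugins) is for running from the command line.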
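Since the right plugin.folders value depends on where Nutch is launched from, a quick check of which candidate actually contains plugins can save some guessing. The following is only a rough sketch: PluginFolderCheck and findPluginDescriptors are hypothetical names, not part of Nutch, and the check assumes relative values are resolved against the working directory. Nutch's own lookup may differ (for example, it may also search the classpath).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Nutch class: lists the plugin.xml descriptors
// found one level below each folder named in a plugin.folders value.
public class PluginFolderCheck {

    static List<File> findPluginDescriptors(File baseDir, String pluginFolders) {
        List<File> found = new ArrayList<>();
        // Assumption: the value may list several folders separated by commas or whitespace.
        for (String name : pluginFolders.split("[,\\s]+")) {
            if (name.isEmpty()) {
                continue;
            }
            File folder = new File(name);
            if (!folder.isAbsolute()) {
                // Assumption: relative values are resolved against baseDir.
                folder = new File(baseDir, name);
            }
            File[] children = folder.listFiles();
            if (children == null) {
                continue; // folder is missing or is not a directory
            }
            for (File child : children) {
                // Entries such as .svn have no plugin.xml and are skipped.
                File descriptor = new File(child, "plugin.xml");
                if (descriptor.isFile()) {
                    found.add(descriptor);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // In Eclipse the working directory is normally the project root.
        File baseDir = new File(System.getProperty("user.dir"));
        for (String value : new String[] {"src/plugin", "plugins"}) {
            int count = findPluginDescriptors(baseDir, value).size();
            System.out.println(value + " -> " + count + " plugin.xml file(s)");
        }
    }
}
```

Running it from the same working directory as the failing job shows which of the two values from the answers above points at a folder that really holds plugins.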