Associative computation between text files
【Problem】
I am very new to this kind of work so bear with me please :) I am trying to calculate means over ranges of patterns. E.g. I have two files which are tab delimited:
The file coverage.txt contains two columns. The first column indicates the position and the second the value assigned to that position. There are ca. 4*10^6 positions.
coverage.txt
The second file patterns.txt contains three columns 1. the name of the pattern, 2. the starting position of the pattern and 3. end position of the pattern. The pattern ranges do not overlap. There are ca. 3000 patterns.
patterns.txt
Now I want to calculate the mean of the values assigned to the positions of the different patterns and write the output to a new file containing the first column of patterns.txt as an identifier.
output.txt
I think this can be accomplished using awk but I do not know where to start. Your help would be greatly appreciated!
Answer given in the thread:
With four million positions, it might be time to reach for a more substantial programming language than shell/awk, but you can do it in a single pass with something like this:
This omits any patterns that don’t have any data in the coverage file; you can change the loop in the END block to loop over everything in the patterns file instead and just output 0s for the ones that didn’t show up.
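The awk script the answer refers to is not reproduced here. As a rough illustration of the single-pass idea, the following is a minimal Python sketch, not the original awk code. It assumes both files are tab-delimited with no header line, that positions are integers, and that pattern ranges are inclusive and non-overlapping, as the question states. The function names `load_patterns`, `pattern_means` and `write_means` and the `include_empty` flag are hypothetical.

```python
import bisect
import csv


def load_patterns(path):
    """Read (name, start, end) rows from a tab-delimited file, sorted by start."""
    with open(path, newline="") as fh:
        rows = [(r[0], int(r[1]), int(r[2]))
                for r in csv.reader(fh, delimiter="\t") if r]
    return sorted(rows, key=lambda p: p[1])


def pattern_means(coverage_path, patterns, include_empty=False):
    """Make a single pass over the coverage file and return {name: mean value}."""
    starts = [p[1] for p in patterns]
    totals, counts = {}, {}
    with open(coverage_path, newline="") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            if not row:
                continue
            pos, value = int(row[0]), float(row[1])
            # Ranges do not overlap, so the only candidate is the last
            # pattern whose start is <= pos.
            i = bisect.bisect_right(starts, pos) - 1
            if i >= 0 and pos <= patterns[i][2]:
                name = patterns[i][0]
                totals[name] = totals.get(name, 0.0) + value
                counts[name] = counts.get(name, 0) + 1
    means = {name: totals[name] / counts[name] for name in totals}
    if include_empty:
        # Assumption: patterns with no coverage data are reported as 0.
        for name, _, _ in patterns:
            means.setdefault(name, 0.0)
    return means


def write_means(path, patterns, means):
    """Write name<TAB>mean lines, keeping the sorted pattern order."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t")
        for name, _, _ in patterns:
            if name in means:
                writer.writerow([name, means[name]])
```

Calling `pattern_means("coverage.txt", load_patterns("patterns.txt"))` and passing the result to `write_means("output.txt", ...)` would cover the workflow in the question. Only the roughly 3000 patterns are held in memory, while the coverage file is streamed line by line.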
【Answer】
Besides awk, this can also be done with esProc; the code is shorter and it also runs fast.
The SPL script is as follows:
	A
1	=file("coverage.txt").import()
2	=file("patterns.txt").import()
3	=A2.new(#1,A1.select(#1>=A2.#2 && #1<=A2.#3).avg(#2))
4	=file("output.txt").export(A3)
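For readers unfamiliar with SPL, the script appears to work as follows: load both files, then for each pattern select the coverage rows whose position lies between its start and end, and average their values. A minimal Python sketch of that same logic is given below. It is an illustration only, not esProc code; the function name `means_by_filter` and the in-memory tuples are assumptions. Like the script, it rescans the coverage rows once per pattern.

```python
from statistics import mean


def means_by_filter(coverage, patterns):
    """coverage: list of (position, value); patterns: list of (name, start, end).

    Returns a list of (name, mean or None), one entry per pattern.
    """
    result = []
    for name, start, end in patterns:
        # Keep the values whose position falls inside the inclusive range.
        selected = [v for pos, v in coverage if start <= pos <= end]
        # Assumption: report None when a pattern has no coverage rows.
        result.append((name, mean(selected) if selected else None))
    return result
```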
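esProc provides a JDBC interface, so it can be embedded in a Java application in the same way as a database, and it is simple to use. See "How to call an SPL script in Java" for details.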