Full Record |
Library(ies): |
Embrapa Agrobiologia. |
Current date: |
30/05/2019 |
Date of last update: |
19/11/2019 |
Type of scientific production: |
Article in Indexed Journal |
Authorship: |
CRUZ, S. M. S. da; NASCIMENTO, J. A. P. do. |
Affiliation: |
Sérgio Manuel Serra da Cruz; JOSE ANTONIO PIRES DO NASCIMENTO, CNPAB. |
Title: |
Towards integration of data-driven agronomic experiments with data provenance. |
Year of publication: |
2019 |
Source/Imprint: |
Computers and Electronics in Agriculture, v. 161, p. 14-28, 2019. |
DOI: |
https://doi.org/10.1016/j.compag.2019.01.044 |
Language: |
English |
Content: |
With improvements in computing and communications, the amount of scientific data in agriculture has been exploding. Thus, researchers must rely on computational simulations to model data-driven in silico agronomic experiments, that is, experiments executed entirely with computer models. Reproducibility, transparency, and independent verification are major features of science. However, even agricultural research of exemplary quality may have irreproducible empirical findings because of random or systematic error. Funding agencies, researchers, and reviewers are demanding improved processes and the use of open data to increase the reproducibility of those experiments. Currently, there are no scientific criteria to evaluate the integration of data-driven agronomic experiments with data provenance. We propose RFlow, a framework that aids researchers in managing, sharing, and enacting the scientific in silico experiments of research projects that use reusable R scripts. The framework uses open data standards and transparently captures the provenance of the agronomic experiments. RFlow is non-intrusive, can be connected to workflow systems, and does not require researchers to change the way they work. Our computational experiments show that the framework can collect provenance metadata and enrich a scientific project. This study shows how RFlow can serve as the primary integration platform for statistical systems, like R, with implications for other data- and compute-intensive agronomic projects. As a proof of concept, we show the concrete effectiveness and expressive power of RFlow, which was evaluated through a set of data-driven agronomic in silico experiments and provenance SQL queries that exemplify what kind of information was gathered. |
Keywords: |
Reproducible research; Scientific workflows. |
NAL Thesaurus: |
Agriculture; Data analysis; Provenance. |
Subject category: |
X Research, Technology and Engineering |
Marc: |
LEADER 02443naa a2200205 a 4500
001 2109458
005 2019-11-19
008 2019 bl uuuu u00u1 u #d
024 7 $ahttps://doi.org/10.1016/j.compag.2019.01.044$2DOI
100 1 $aCRUZ, S. M. S. da
245 $aTowards integration of data-driven agronomic experiments with data provenance.$h[electronic resource]
260 $c2019
520 $aWith improvements in computing and communications, the amount of scientific data in agriculture has been exploding. Thus, researchers must rely on computational simulations to model the data-driven in silico agronomic experiments, the in silico experiments are those that are completely executed by using computer models. Reproducibility, transparency, independent verification are major features of Science. However, even agricultural research of exemplary quality may have irreproducible empirical findings because of random or systematic error. Funding agencies, researchers, and reviewers are demanding improved processes and the use of open data to increase reproducibility of those experiments. Currently, there are no scientific criteria to evaluate the integration of data-driven agronomic experiments with data provenance. We propose RFlow, a framework that aid researchers to manage, share, and enact the scientific in silico experiments of research projects that use reusable R scripts. The framework uses open data standards and transparently captures provenance of the agronomic experiments. RFlow is non-intrusive, can be connected to workflow systems and does not require researchers to change their working way. Our computational experiments show that the framework can collect provenance metadata and enrich a scientific project. This study shows how RFlow can serve as the primary integration platform for statistical systems, like R, with implications for other data and compute-intensive agronomic projects. As a proof of concept, we show the concrete effectiveness and expressive power of the RFlow which was evaluated through a set of data-driven agronomic in silico experiments and provenance SQL queries that exemplifies what kind of information was gathered.
650 $aAgriculture
650 $aData analysis
650 $aProvenance
653 $aReproducible research
653 $aScientific workflows
700 1 $aNASCIMENTO, J. A. P. do
773 $tComputers and Electronics in Agriculture$gv. 161, p. 14-28, 2019.
Original record: |
Embrapa Agrobiologia (CNPAB) |
|