E-book
Author Bannert, Matthias.

Title Research Software Engineering: A Guide to the Open Source Ecosystem
Published [S.l.] : Chapman & Hall/CRC, 2024

Description 1 online resource
Contents Cover -- Half Title -- Series Page -- Title Page -- Copyright Page -- Contents -- List of Figures -- List of Tables -- Preface -- Acknowledgments -- 1. Introduction -- 1.1. Why Work Like a Software Engineer? -- 1.2. Why Work Like an Operations Engineer? -- 1.3. How To Read This Book? -- 1.4. Backlog -- 1.5. Requirements -- 2. Stack: A Developer's Toolkit -- 2.1. Programming Language -- 2.2. Interaction Environment -- 2.3. Version Control -- 2.4. Data Management -- 2.5. Infrastructure -- 2.6. Automation -- 2.7. Communication Tools -- 2.8. Publishing and Reporting -- 3. Programming 101
3.1. The Choice That Doesn't Matter -- 3.2. Plan Your Program -- 3.2.1. Think Library! -- 3.2.2. Documentation -- 3.2.3. Design Your Interface -- 3.2.4. Dependencies -- 3.2.5. Folder Structure -- 3.3. Naming Conventions: Snake, Camel or Kebab Case -- 3.4. Testing -- 3.5. Debugging -- 3.5.1. Read Code from the Inside Out -- 3.5.2. Debugger, Breakpoints, Traceback -- 3.6. A Word on Peer Programming -- 4. Interaction Environment -- 4.1. Integrated Development Environments -- 4.1.1. RStudio -- 4.1.2. Visual Studio Code -- 4.1.3. Editors on Steroids -- 4.2. Notebooks -- 4.3. Console/Terminal
4.3.1. Remote Connections SSH, SCP -- 4.3.2. Git Through the Console -- 5. Git Version Control -- 5.1. What Is Git Version Control? -- 5.2. Why Use Version Control in Research? -- 5.3. How Does Git Work? -- 5.4. Moving Around -- 5.5. Collaboration Workflow -- 5.5.1. Feature Branches -- 5.5.2. Pull Requests from Forks -- 5.5.3. Rebase vs. Merge -- 6. Data Management -- 6.1. Forms of Data -- 6.2. Representing Data in Files -- 6.2.1. Spreadsheets -- 6.2.2. File Formats for Nested Information -- 6.2.3. A Word on Binaries -- 6.2.4. Interoperable File Formats -- 6.3. Databases
6.3.1. Relational Database Management Systems (RDBMS) -- 6.3.2. A Word on Non-Relational Databases -- 6.4. Non-Technical Aspects of Managing Data -- 6.4.1. Etiquette -- 6.4.2. Security -- 6.4.3. Privacy -- 6.4.4. Data Publications -- 7. Infrastructure -- 7.1. Why Go Beyond a Local Notebook? -- 7.2. Hosting Options -- 7.2.1. Software-as-a-Service -- 7.2.2. Self-Hosted -- 7.3. Building Blocks -- 7.3.1. Virtual Machines -- 7.3.2. Containers and Images -- 7.3.3. Kubernetes -- 7.4. Applied Containerization Basics -- 7.4.1. DOCKERFILEs -- 7.4.2. Building and Running Containers
7.4.3. Docker Compose -- Manage Multiple Containers -- 7.4.4. A Little Docker Debugging Tip -- 8. Automation -- 8.1. Continuous Integration/Continuous Deployment -- 8.2. Cron Jobs -- 8.3. Workflow Scheduling: Apache Airflow DAGs -- 8.4. Make-Like Workflows -- 8.5. Infrastructure as Code -- 9. Community -- 9.1. Stay Up-to-Date in a Vastly Evolving Field -- Social Media -- 9.2. Knowledge-Sharing Platforms -- 9.3. Look Out for Local Community Group -- 9.4. Attend Conferences -- Online Can Be a Viable Option! -- 9.5. Join a Chat Space -- 10. Publishing and Reporting
Summary Research Software Engineering: A Guide to the Open Source Ecosystem strives to give a big-picture overview and an understanding of the opportunities of programming as an approach to analytics and statistics. The book argues that a solid "programming" skill level is not only well within reach for many but also worth pursuing for researchers and business analysts. The ability to write a program leverages field-specific expertise and fosters interdisciplinary collaboration as source code continues to become an important communication channel. Given the pace of development in data science, many senior researchers and mentors, as well as non-computer science curricula, lack a basic software engineering component. This book fills the gap by providing a dedicated programming-with-data resource to both academic scholars and practitioners. Key features: (1) overview: breakdown of complex data science software stacks into core components; (2) applied: source code of figures, tables and examples is available and reproducible solely with open source software that is free of license costs; (3) reader guidance: different entry points and rich references to deepen the understanding of selected aspects.
Notes Matthias Bannert, Ph.D., gained his hands-on data science and data engineering experience at ETH Zürich in more than a decade of working for the KOF Swiss Economic Institute. Today, he works as a data engineering expert advisor at cynkra and supports ETH as a section lead in the innovation-minded KOF Lab. In 2021, he was a co-chair of useR!, the annual user conference of the R Project for Statistical Computing. He remains an active contributor to extension packages of the R language and the open source community in general.
Subject Research -- Data processing.
Computer software -- Development.
BUSINESS & ECONOMICS / Statistics
Genre/Form Electronic books
Form Electronic book
ISBN 9781040005125
1040005128
9781003286899
1003286895
9781040005132
1040005136