Euro-Par 2021 : Parallel processing : 27th International Conference on Parallel and Distributed Computing, Lisbon, Portugal, September 1-3, 2021, Proceedings / Leonel Sousa, Nuno Roma, Pedro Tomás (eds.).

This book constitutes the proceedings of the 27th International Conference on Parallel and Distributed Computing, Euro-Par 2021, held in Lisbon, Portugal, in September 2021. The conference was held virtually due to the COVID-19 pandemic. The 38 full papers presented in this volume were carefully review...

Bibliographic Details
Corporate Author: International EURO-PAR Conference Online
Other Authors: Sousa, Leonel A. (Leonel Augusto) (Editor), Roma, Nuno (Editor), Tomás, Pedro (Editor)
Format: eBook
Language: English
Published: Cham, Switzerland : Springer, 2021.
Series: Lecture notes in computer science ; 12820.
LNCS sublibrary. Theoretical computer science and general issues.
Table of Contents:
  • Compilers, Tools and Environments
  • ALONA: Automatic Loop Nest Approximation with Reconstruction and Space Pruning
  • Automatic low-overhead load-imbalance detection in MPI applications
  • Performance and Power Modeling, Prediction and Evaluation
  • Trace-driven Workload Generation and Execution
  • Update on the Asymptotic Optimality of LPT
  • E2EWatch: An End-to-end Anomaly Diagnosis Framework for Production HPC Systems
  • Scheduling and Load Balancing
  • Collaborative GPU Preemption via Spatial Multitasking for Efficient GPU Sharing
  • A Fixed-Parameter Algorithm for Scheduling Unit Dependent Tasks with Unit Communication Delays
  • Plan-based Job Scheduling for Supercomputers with Shared Burst Buffers
  • Taming Tail Latency in Key-Value Stores: a Scheduling Perspective
  • A log-linear (2+5/6)-approximation algorithm for parallel machine scheduling with a single orthogonal resource
  • An MPI-Parallel Algorithm for Mapping Complex Networks onto Hierarchical Architectures
  • Pipelined Model Parallelism: Complexity Results and Memory Considerations
  • Data Management, Analytics and Machine Learning
  • Efficient and Systematic Partitioning of Large and Deep Neural Networks for Parallelization
  • A GPU Architecture Aware Fine-Grain Pruning Technique for Deep Neural Networks
  • Towards Flexible and Compiler-Friendly Layer Fusion for CNNs on Multicore CPUs
  • Smart Distributed Data Sets for Stream Processing
  • Cluster, Cloud and Edge Computing
  • Colony: Parallel Functions as a Service on the Cloud-Edge Continuum
  • Horizontal Scaling in Cloud using Contextual Bandits
  • Geo-Distribute Cloud Applications at the Edge
  • A Fault Tolerant and Deadline Constrained Sequence Alignment Application on Cloud-based Spot GPU Instances
  • Sustaining Performance While Reducing Energy Consumption: A Control Theory Approach
  • Theory and Algorithms for Parallel and Distributed Processing
  • Algorithm design for Tensor Units
  • A Scalable Approximation Algorithm for Weighted Longest Common Subsequence
  • TSL Queue: An Efficient Lock-free Design for Priority Queues
  • G-Morph: Induced Subgraph Isomorphism Search of Labeled Graphs on a GPU
  • Parallel and Distributed Programming, Interfaces, and Languages
  • Accelerating Graph Applications Using Phased Transactional Memory
  • Efficient GPU Computation using Task Graph Parallelism
  • Towards High Performance Resilience using Performance Portable Abstractions
  • Enhancing Load-Balancing of MPI Applications with Workshare
  • Particle-In-Cell Simulation using Asynchronous Tasking
  • Multicore and Manycore Parallelism
  • Exploiting co-execution with oneAPI: heterogeneity from a modern perspective
  • Parallel Numerical Methods and Applications
  • Designing a 3D Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems
  • Fault-tolerant LU factorization is low cost
  • Mixed Precision Incomplete and Factorized Sparse Approximate Inverse Preconditioning on GPUs
  • Outsmarting the Atmospheric Turbulence for Ground-Based Telescopes Using the Stochastic Levenberg-Marquardt Method
  • GPU Accelerated Mahalanobis-average Hierarchical Clustering Analysis
  • High performance architectures and accelerators
  • PrioRAT: Criticality-Driven Prioritization Inside the On-Chip Memory Hierarchy
  • Optimized Implementation of the HPCG Benchmark on Reconfigurable Hardware.