Skip to content

Conversation

cachemeifyoucan
Copy link
Collaborator

Add output backend that can be used to virtualize outputs from all LLVM based tools. This can easily allow functions like:

  • redirect the compiler output
  • duplicating the compiler output
  • output to memory instead of to file system
  • set the output to NULL
  • etc.

This change also makes creating different kind of output from LLVM tools easier, and also provides correct error handling from the process.

@llvmbot llvmbot added clang Clang issues not falling into any other category llvm:support labels Oct 6, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 6, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-llvm-support

Changes

Add output backend that can be used to virtualize outputs from all LLVM based tools. This can easily allow functions like:

  • redirect the compiler output
  • duplicating the compiler output
  • output to memory instead of to file system
  • set the output to NULL
  • etc.

This change also makes creating different kind of output from LLVM tools easier, and also provides correct error handling from the process.


Patch is 132.10 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/68447.diff

23 Files Affected:

  • (modified) clang/include/clang/Frontend/CompilerInstance.h (+21-15)
  • (modified) clang/lib/Frontend/CompilerInstance.cpp (+55-118)
  • (added) llvm/include/llvm/Support/HashingOutputBackend.h (+112)
  • (added) llvm/include/llvm/Support/VirtualOutputBackend.h (+62)
  • (added) llvm/include/llvm/Support/VirtualOutputBackends.h (+110)
  • (added) llvm/include/llvm/Support/VirtualOutputConfig.def (+26)
  • (added) llvm/include/llvm/Support/VirtualOutputConfig.h (+91)
  • (added) llvm/include/llvm/Support/VirtualOutputError.h (+134)
  • (added) llvm/include/llvm/Support/VirtualOutputFile.h (+162)
  • (added) llvm/include/llvm/Support/raw_ostream_proxy.h (+158)
  • (modified) llvm/lib/Support/CMakeLists.txt (+6)
  • (added) llvm/lib/Support/VirtualOutputBackend.cpp (+38)
  • (added) llvm/lib/Support/VirtualOutputBackends.cpp (+594)
  • (added) llvm/lib/Support/VirtualOutputConfig.cpp (+50)
  • (added) llvm/lib/Support/VirtualOutputError.cpp (+53)
  • (added) llvm/lib/Support/VirtualOutputFile.cpp (+106)
  • (added) llvm/lib/Support/raw_ostream_proxy.cpp (+15)
  • (modified) llvm/unittests/Support/CMakeLists.txt (+5)
  • (added) llvm/unittests/Support/VirtualOutputBackendTest.cpp (+147)
  • (added) llvm/unittests/Support/VirtualOutputBackendsTest.cpp (+886)
  • (added) llvm/unittests/Support/VirtualOutputConfigTest.cpp (+152)
  • (added) llvm/unittests/Support/VirtualOutputFileTest.cpp (+342)
  • (added) llvm/unittests/Support/raw_ostream_proxy_test.cpp (+219)
diff --git a/clang/include/clang/Frontend/CompilerInstance.h b/clang/include/clang/Frontend/CompilerInstance.h
index d26a452cf94cc3b..9295b2bee115cc0 100644
--- a/clang/include/clang/Frontend/CompilerInstance.h
+++ b/clang/include/clang/Frontend/CompilerInstance.h
@@ -24,6 +24,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/BuryPointer.h"
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/VirtualOutputBackend.h"
 #include <cassert>
 #include <list>
 #include <memory>
@@ -92,6 +93,9 @@ class CompilerInstance : public ModuleLoader {
   /// The file manager.
   IntrusiveRefCntPtr<FileManager> FileMgr;
 
+  /// The output context.
+  IntrusiveRefCntPtr<llvm::vfs::OutputBackend> TheOutputBackend;
+
   /// The source manager.
   IntrusiveRefCntPtr<SourceManager> SourceMgr;
 
@@ -164,22 +168,8 @@ class CompilerInstance : public ModuleLoader {
   /// The stream for verbose output.
   raw_ostream *VerboseOutputStream = &llvm::errs();
 
-  /// Holds information about the output file.
-  ///
-  /// If TempFilename is not empty we must rename it to Filename at the end.
-  /// TempFilename may be empty and Filename non-empty if creating the temporary
-  /// failed.
-  struct OutputFile {
-    std::string Filename;
-    std::optional<llvm::sys::fs::TempFile> File;
-
-    OutputFile(std::string filename,
-               std::optional<llvm::sys::fs::TempFile> file)
-        : Filename(std::move(filename)), File(std::move(file)) {}
-  };
-
   /// The list of active output files.
-  std::list<OutputFile> OutputFiles;
+  std::list<llvm::vfs::OutputFile> OutputFiles;
 
   /// Force an output buffer.
   std::unique_ptr<llvm::raw_pwrite_stream> OutputStream;
@@ -430,6 +420,22 @@ class CompilerInstance : public ModuleLoader {
 
   /// Replace the current file manager and virtual file system.
   void setFileManager(FileManager *Value);
+  /// @name Output Backend.
+  /// {
+
+  /// Set the output backend.
+  void
+  setOutputBackend(IntrusiveRefCntPtr<llvm::vfs::OutputBackend> NewOutputs);
+
+  /// Create an output manager.
+  void createOutputBackend();
+
+  bool hasOutputBackend() const { return bool(TheOutputBackend); }
+
+  llvm::vfs::OutputBackend &getOutputBackend();
+  llvm::vfs::OutputBackend &getOrCreateOutputBackend();
+
+  /// }
 
   /// @}
   /// @name Source Manager
diff --git a/clang/lib/Frontend/CompilerInstance.cpp b/clang/lib/Frontend/CompilerInstance.cpp
index d18371f21a9d86e..f00eb145d8d2e4f 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -54,6 +54,8 @@
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/TimeProfiler.h"
 #include "llvm/Support/Timer.h"
+#include "llvm/Support/VirtualOutputBackends.h"
+#include "llvm/Support/VirtualOutputError.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/TargetParser/Host.h"
 #include <optional>
@@ -516,6 +518,10 @@ void CompilerInstance::createPreprocessor(TranslationUnitKind TUKind) {
     collectVFSEntries(*this, ModuleDepCollector);
   }
 
+  // Modules need an output manager.
+  if (!hasOutputBackend())
+    createOutputBackend();
+
   for (auto &Listener : DependencyCollectors)
     Listener->attachToPreprocessor(*PP);
 
@@ -759,32 +765,19 @@ void CompilerInstance::createSema(TranslationUnitKind TUKind,
 void CompilerInstance::clearOutputFiles(bool EraseFiles) {
   // The ASTConsumer can own streams that write to the output files.
   assert(!hasASTConsumer() && "ASTConsumer should be reset");
-  // Ignore errors that occur when trying to discard the temp file.
-  for (OutputFile &OF : OutputFiles) {
-    if (EraseFiles) {
-      if (OF.File)
-        consumeError(OF.File->discard());
-      if (!OF.Filename.empty())
-        llvm::sys::fs::remove(OF.Filename);
-      continue;
-    }
-
-    if (!OF.File)
-      continue;
-
-    if (OF.File->TmpName.empty()) {
-      consumeError(OF.File->discard());
-      continue;
-    }
-
-    llvm::Error E = OF.File->keep(OF.Filename);
-    if (!E)
-      continue;
-
-    getDiagnostics().Report(diag::err_unable_to_rename_temp)
-        << OF.File->TmpName << OF.Filename << std::move(E);
-
-    llvm::sys::fs::remove(OF.File->TmpName);
+  if (!EraseFiles) {
+    for (auto &O : OutputFiles)
+      llvm::handleAllErrors(
+          O.keep(),
+          [&](const llvm::vfs::TempFileOutputError &E) {
+            getDiagnostics().Report(diag::err_unable_to_rename_temp)
+                << E.getTempPath() << E.getOutputPath()
+                << E.convertToErrorCode().message();
+          },
+          [&](const llvm::vfs::OutputError &E) {
+            getDiagnostics().Report(diag::err_fe_unable_to_open_output)
+                << E.getOutputPath() << E.convertToErrorCode().message();
+          });
   }
   OutputFiles.clear();
   if (DeleteBuiltModules) {
@@ -818,6 +811,28 @@ std::unique_ptr<raw_pwrite_stream> CompilerInstance::createNullOutputFile() {
   return std::make_unique<llvm::raw_null_ostream>();
 }
 
+void CompilerInstance::setOutputBackend(
+    IntrusiveRefCntPtr<llvm::vfs::OutputBackend> NewOutputs) {
+  assert(!TheOutputBackend && "Already has an output manager");
+  TheOutputBackend = std::move(NewOutputs);
+}
+
+void CompilerInstance::createOutputBackend() {
+  assert(!TheOutputBackend && "Already has an output manager");
+  TheOutputBackend = llvm::makeIntrusiveRefCnt<llvm::vfs::OnDiskOutputBackend>();
+}
+
+llvm::vfs::OutputBackend &CompilerInstance::getOutputBackend() {
+  assert(TheOutputBackend);
+  return *TheOutputBackend;
+}
+
+llvm::vfs::OutputBackend &CompilerInstance::getOrCreateOutputBackend() {
+  if (!hasOutputBackend())
+    createOutputBackend();
+  return getOutputBackend();
+}
+
 std::unique_ptr<raw_pwrite_stream>
 CompilerInstance::createOutputFile(StringRef OutputPath, bool Binary,
                                    bool RemoveFileOnSignal, bool UseTemporary,
@@ -852,98 +867,20 @@ CompilerInstance::createOutputFileImpl(StringRef OutputPath, bool Binary,
     OutputPath = *AbsPath;
   }
 
-  std::unique_ptr<llvm::raw_fd_ostream> OS;
-  std::optional<StringRef> OSFile;
-
-  if (UseTemporary) {
-    if (OutputPath == "-")
-      UseTemporary = false;
-    else {
-      llvm::sys::fs::file_status Status;
-      llvm::sys::fs::status(OutputPath, Status);
-      if (llvm::sys::fs::exists(Status)) {
-        // Fail early if we can't write to the final destination.
-        if (!llvm::sys::fs::can_write(OutputPath))
-          return llvm::errorCodeToError(
-              make_error_code(llvm::errc::operation_not_permitted));
-
-        // Don't use a temporary if the output is a special file. This handles
-        // things like '-o /dev/null'
-        if (!llvm::sys::fs::is_regular_file(Status))
-          UseTemporary = false;
-      }
-    }
-  }
-
-  std::optional<llvm::sys::fs::TempFile> Temp;
-  if (UseTemporary) {
-    // Create a temporary file.
-    // Insert -%%%%%%%% before the extension (if any), and because some tools
-    // (noticeable, clang's own GlobalModuleIndex.cpp) glob for build
-    // artifacts, also append .tmp.
-    StringRef OutputExtension = llvm::sys::path::extension(OutputPath);
-    SmallString<128> TempPath =
-        StringRef(OutputPath).drop_back(OutputExtension.size());
-    TempPath += "-%%%%%%%%";
-    TempPath += OutputExtension;
-    TempPath += ".tmp";
-    llvm::sys::fs::OpenFlags BinaryFlags =
-        Binary ? llvm::sys::fs::OF_None : llvm::sys::fs::OF_Text;
-    Expected<llvm::sys::fs::TempFile> ExpectedFile =
-        llvm::sys::fs::TempFile::create(
-            TempPath, llvm::sys::fs::all_read | llvm::sys::fs::all_write,
-            BinaryFlags);
-
-    llvm::Error E = handleErrors(
-        ExpectedFile.takeError(), [&](const llvm::ECError &E) -> llvm::Error {
-          std::error_code EC = E.convertToErrorCode();
-          if (CreateMissingDirectories &&
-              EC == llvm::errc::no_such_file_or_directory) {
-            StringRef Parent = llvm::sys::path::parent_path(OutputPath);
-            EC = llvm::sys::fs::create_directories(Parent);
-            if (!EC) {
-              ExpectedFile = llvm::sys::fs::TempFile::create(
-                  TempPath, llvm::sys::fs::all_read | llvm::sys::fs::all_write,
-                  BinaryFlags);
-              if (!ExpectedFile)
-                return llvm::errorCodeToError(
-                    llvm::errc::no_such_file_or_directory);
-            }
-          }
-          return llvm::errorCodeToError(EC);
-        });
-
-    if (E) {
-      consumeError(std::move(E));
-    } else {
-      Temp = std::move(ExpectedFile.get());
-      OS.reset(new llvm::raw_fd_ostream(Temp->FD, /*shouldClose=*/false));
-      OSFile = Temp->TmpName;
-    }
-    // If we failed to create the temporary, fallback to writing to the file
-    // directly. This handles the corner case where we cannot write to the
-    // directory, but can write to the file.
-  }
-
-  if (!OS) {
-    OSFile = OutputPath;
-    std::error_code EC;
-    OS.reset(new llvm::raw_fd_ostream(
-        *OSFile, EC,
-        (Binary ? llvm::sys::fs::OF_None : llvm::sys::fs::OF_TextWithCRLF)));
-    if (EC)
-      return llvm::errorCodeToError(EC);
-  }
-
-  // Add the output file -- but don't try to remove "-", since this means we are
-  // using stdin.
-  OutputFiles.emplace_back(((OutputPath != "-") ? OutputPath : "").str(),
-                           std::move(Temp));
-
-  if (!Binary || OS->supportsSeeking())
-    return std::move(OS);
-
-  return std::make_unique<llvm::buffer_unique_ostream>(std::move(OS));
+  using namespace llvm::vfs;
+  Expected<OutputFile> O = getOrCreateOutputBackend().createFile(
+      OutputPath,
+      OutputConfig()
+          .setTextWithCRLF(!Binary)
+          .setDiscardOnSignal(RemoveFileOnSignal)
+          .setAtomicWrite(UseTemporary)
+          .setImplyCreateDirectories(UseTemporary && CreateMissingDirectories));
+  if (!O)
+    return O.takeError();
+
+  O->discardOnDestroy([](llvm::Error E) { consumeError(std::move(E)); });
+  OutputFiles.push_back(std::move(*O));
+  return OutputFiles.back().createProxy();
 }
 
 // Initialization Utilities
diff --git a/llvm/include/llvm/Support/HashingOutputBackend.h b/llvm/include/llvm/Support/HashingOutputBackend.h
new file mode 100644
index 000000000000000..d2e79663f552666
--- /dev/null
+++ b/llvm/include/llvm/Support/HashingOutputBackend.h
@@ -0,0 +1,112 @@
+//===- HashingOutputBackends.h - Hashing output backends --------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_HASHINGOUTPUTBACKEND_H
+#define LLVM_SUPPORT_HASHINGOUTPUTBACKEND_H
+
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringMap.h"
+#include "llvm/Support/Endian.h"
+#include "llvm/Support/HashBuilder.h"
+#include "llvm/Support/VirtualOutputBackend.h"
+#include "llvm/Support/VirtualOutputConfig.h"
+#include "llvm/Support/raw_ostream.h"
+
+namespace llvm::vfs {
+
+/// raw_pwrite_stream that writes to a hasher.
+template <typename HasherT>
+class HashingStream : public llvm::raw_pwrite_stream {
+private:
+  SmallVector<char> Buffer;
+  raw_svector_ostream OS;
+
+  using HashBuilderT = HashBuilder<HasherT, support::endianness::native>;
+  HashBuilderT Builder;
+
+  void write_impl(const char *Ptr, size_t Size) override {
+    OS.write(Ptr, Size);
+  }
+
+  void pwrite_impl(const char *Ptr, size_t Size, uint64_t Offset) override {
+    OS.pwrite(Ptr, Size, Offset);
+  }
+
+  uint64_t current_pos() const override { return OS.str().size(); }
+
+public:
+  HashingStream() : OS(Buffer) { SetUnbuffered(); }
+
+  auto final() {
+    Builder.update(OS.str());
+    return Builder.final();
+  }
+};
+
+template <typename HasherT> class HashingOutputFile;
+
+/// An output backend that only generates the hash for outputs.
+template <typename HasherT> class HashingOutputBackend : public OutputBackend {
+private:
+  friend class HashingOutputFile<HasherT>;
+  void addOutputFile(StringRef Path, StringRef Hash) {
+    OutputHashes[Path] = std::string(Hash);
+  }
+
+protected:
+  IntrusiveRefCntPtr<OutputBackend> cloneImpl() const override {
+    return const_cast<HashingOutputBackend<HasherT> *>(this);
+  }
+
+  Expected<std::unique_ptr<OutputFileImpl>>
+  createFileImpl(StringRef Path, std::optional<OutputConfig> Config) override {
+    return std::make_unique<HashingOutputFile<HasherT>>(Path, *this);
+  }
+
+public:
+  /// Iterator for all the output file names.
+  auto outputFiles() const { return OutputHashes.keys(); }
+
+  /// Get hash value for the output files in hex representation.
+  /// Return None if the requested path is not generated.
+  std::optional<std::string> getHashValueForFile(StringRef Path) {
+    auto F = OutputHashes.find(Path);
+    if (F == OutputHashes.end())
+      return std::nullopt;
+    return toHex(F->second);
+  }
+
+private:
+  StringMap<std::string> OutputHashes;
+};
+
+/// HashingOutputFile.
+template <typename HasherT>
+class HashingOutputFile final : public OutputFileImpl {
+public:
+  Error keep() override {
+    auto Result = OS.final();
+    Backend.addOutputFile(OutputPath, toStringRef(Result));
+    return Error::success();
+  }
+  Error discard() override { return Error::success(); }
+  raw_pwrite_stream &getOS() override { return OS; }
+
+  HashingOutputFile(StringRef OutputPath,
+                    HashingOutputBackend<HasherT> &Backend)
+      : OutputPath(OutputPath.str()), Backend(Backend) {}
+
+private:
+  const std::string OutputPath;
+  HashingStream<HasherT> OS;
+  HashingOutputBackend<HasherT> &Backend;
+};
+
+} // namespace llvm::vfs
+
+#endif // LLVM_SUPPORT_HASHINGOUTPUTBACKEND_H
diff --git a/llvm/include/llvm/Support/VirtualOutputBackend.h b/llvm/include/llvm/Support/VirtualOutputBackend.h
new file mode 100644
index 000000000000000..2328252c7054fcb
--- /dev/null
+++ b/llvm/include/llvm/Support/VirtualOutputBackend.h
@@ -0,0 +1,62 @@
+//===- VirtualOutputBackend.h - Output virtualization -----------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_VIRTUALOUTPUTBACKEND_H
+#define LLVM_SUPPORT_VIRTUALOUTPUTBACKEND_H
+
+#include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/VirtualOutputConfig.h"
+#include "llvm/Support/VirtualOutputFile.h"
+
+namespace llvm::vfs {
+
+/// Interface for virtualized outputs.
+///
+/// If virtual functions are added here, also add them to \a
+/// ProxyOutputBackend.
+class OutputBackend : public RefCountedBase<OutputBackend> {
+  virtual void anchor();
+
+public:
+  /// Get a backend that points to the same destination as this one but that
+  /// has independent settings.
+  ///
+  /// Not thread-safe, but all operations are thread-safe when performed on
+  /// separate clones of the same backend.
+  IntrusiveRefCntPtr<OutputBackend> clone() const { return cloneImpl(); }
+
+  /// Create a file. If \p Config is \c std::nullopt, uses the backend's default
+  /// OutputConfig (may match \a OutputConfig::OutputConfig(), or may
+  /// have been customized).
+  ///
+  /// Thread-safe.
+  Expected<OutputFile>
+  createFile(const Twine &Path,
+             std::optional<OutputConfig> Config = std::nullopt);
+
+protected:
+  /// Must be thread-safe. Virtual function has a different name than \a
+  /// clone() so that implementations can override the return value.
+  virtual IntrusiveRefCntPtr<OutputBackend> cloneImpl() const = 0;
+
+  /// Create a file for \p Path. Must be thread-safe.
+  ///
+  /// \pre \p Config is valid or std::nullopt.
+  virtual Expected<std::unique_ptr<OutputFileImpl>>
+  createFileImpl(StringRef Path, std::optional<OutputConfig> Config) = 0;
+
+  OutputBackend() = default;
+
+public:
+  virtual ~OutputBackend() = default;
+};
+
+} // namespace llvm::vfs
+
+#endif // LLVM_SUPPORT_VIRTUALOUTPUTBACKEND_H
diff --git a/llvm/include/llvm/Support/VirtualOutputBackends.h b/llvm/include/llvm/Support/VirtualOutputBackends.h
new file mode 100644
index 000000000000000..6f702000d77b3e4
--- /dev/null
+++ b/llvm/include/llvm/Support/VirtualOutputBackends.h
@@ -0,0 +1,110 @@
+//===- VirtualOutputBackends.h - Virtual output backends --------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_VIRTUALOUTPUTBACKENDS_H
+#define LLVM_SUPPORT_VIRTUALOUTPUTBACKENDS_H
+
+#include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/Support/VirtualOutputBackend.h"
+#include "llvm/Support/VirtualOutputConfig.h"
+
+namespace llvm::vfs {
+
+/// Create a backend that ignores all output.
+IntrusiveRefCntPtr<OutputBackend> makeNullOutputBackend();
+
+/// Make a backend where \a OutputBackend::createFile() forwards to
+/// \p UnderlyingBackend when \p Filter is true, and otherwise returns a
+/// \a NullOutput.
+IntrusiveRefCntPtr<OutputBackend> makeFilteringOutputBackend(
+    IntrusiveRefCntPtr<OutputBackend> UnderlyingBackend,
+    std::function<bool(StringRef, std::optional<OutputConfig>)> Filter);
+
+/// Create a backend that forwards \a OutputBackend::createFile() to both \p
+/// Backend1 and \p Backend2 and sends content to both places.
+IntrusiveRefCntPtr<OutputBackend>
+makeMirroringOutputBackend(IntrusiveRefCntPtr<OutputBackend> Backend1,
+                           IntrusiveRefCntPtr<OutputBackend> Backend2);
+
+/// A helper class for proxying another backend, with the default
+/// implementation to forward to the underlying backend.
+class ProxyOutputBackend : public OutputBackend {
+  void anchor() override;
+
+protected:
+  // Require subclass to implement cloneImpl().
+  //
+  // IntrusiveRefCntPtr<OutputBackend> cloneImpl() const override;
+
+  Expected<std::unique_ptr<OutputFileImpl>>
+  createFileImpl(StringRef Path, std::optional<OutputConfig> Config) override {
+    OutputFile File;
+    if (Error E = UnderlyingBackend->createFile(Path, Config).moveInto(File))
+      return std::move(E);
+    return File.takeImpl();
+  }
+
+  OutputBackend &getUnderlyingBackend() const { return *UnderlyingBackend; }
+
+public:
+  ProxyOutputBackend(IntrusiveRefCntPtr<OutputBackend> UnderlyingBackend)
+      : UnderlyingBackend(std::move(UnderlyingBackend)) {
+    assert(this->UnderlyingBackend && "Expected non-null backend");
+  }
+
+private:
+  IntrusiveRefCntPtr<OutputBackend> UnderlyingBackend;
+};
+
+/// An output backend that creates files on disk, wrapping APIs in sys::fs.
+class OnDiskOutputBackend : public OutputBackend {
+  void anchor() override;
+
+protected:
+  IntrusiveRefCntPtr<OutputBackend> cloneImpl() const override {
+    return clone();
+  }
+
+  Expected<std::unique_ptr<OutputFileImpl>>
+  createFileImpl(StringRef Path, std::optional<OutputConfig> Config) override;
+
+public:
+  /// Resolve an absolute path.
+  Error makeAbsolute(SmallVectorImpl<char> &Path) const;
+
+  /// On disk output settings.
+  struct OutputSettings {
+    /// Register output files to be deleted if a signal is received. Also
+    /// disabled for outputs with \a OutputConfig::getNoDiscardOnSignal().
+    bool DisableRemoveOnSignal = false;
+
+    /// Disable temporary files. Also disabled for outputs with \a
+    /// OutputConfig::getNoAtomicWrite().
+    bool DisableTemporaries = false;
+
+    // Default configuration for this backend.
+    OutputConfig DefaultConfig;
+  };
+
+  IntrusiveRefCntPtr<OnDiskOutputBackend> clone() const {
+    auto Clone = makeIntrusiveRefCnt<OnDiskOutputBackend>();
+    Clone->Settings = Settings;
+    return Clone;
+  }
+
+  OnDiskOutputBackend() = default;
+
+  /// Settings for this backend.
+  ///
+  /// Access is not thread-safe.
+  OutputSettings Settings;
+};
+
+} // namespace llvm::vfs
+
+#endif // LLVM_SUPPORT_VIRTUALOUTPUTBACKENDS_H
diff --git a/llvm/include/llvm/Support/VirtualOutputConfig.def b/llvm/include/llvm/Support/Vi...
[truncated]

Add proxies classes for `raw_ostream` and `raw_pwrite_stream` called
`raw_ostream_proxy` and `raw_pwrite_stream_proxy`. Add adaptor classes,
`raw_ostream_proxy_adaptor<>` and `raw_pwrite_stream_proxy_adaptor<>`,
to allow subclasses to use a different parent class than `raw_ostream`
or `raw_pwrite_stream`.

The adaptors are used by a future patch to help a subclass of
`llvm::vfs::OutputFile`, an abstract subclass of `raw_pwrite_stream`, to
proxy a `raw_fd_ostream`.

Patched by dexonsmith.

Differential Revision: https://reviews.llvm.org/D133503
… outputs

Add OutputBackend and OutputFile to the `llvm::vfs` namespace for
virtualizing compiler outputs. This is intended for use in Clang,

The headers are:

- llvm/Support/VirtualOutputConfig.h
- llvm/Support/VirtualOutputError.h
- llvm/Support/VirtualOutputFile.h
- llvm/Support/VirtualOutputBackend.h

OutputFile is moveable and owns an OutputFileImpl, which is provided by
the derived OutputBackend.

- OutputFileImpl::keep() and OutputFileImpl::discard() should keep or
  discard the output.  OutputFile guarantees that exactly one of these
  will be called before destruction.
- OutputFile::keep() and OutputFile::discard() wrap OutputFileImpl
  and catch usage errors such as double-close.
- OutputFile::discardOnDestroy() installs an error handler for the
  destructor to use if the file is still open. The handler will be
  called if discard() fails.
- OutputFile::~OutputFile() calls report_fatal_error() if none of
  keep(), discard(), or discardOnDestroy() has been called. It still
  calls OutputFileImpl::discard().
- getOS() returns the wrapped raw_pwrite_stream. For convenience,
  OutputFile has an implicit conversion to `raw_ostream` and
  `raw_ostream &operator<<(OutputFile&, T&&)`.

OutputBackend can be stored in IntrusiveRefCntPtr.

- Most operations are thread-safe.
- clone() returns a backend that targets the same destination.
  All operations are thread-safe when done on different clones.
- createFile() takes a path and an OutputConfig (see below) and returns
  an OutputFile. Backends implement createFileImpl().

OutputConfig has flags to configure the output. Backends may ignore or
override flags that aren't relevant or implementable.

- The initial flags are:
    - AtomicWrite: whether the output should appear atomically (e.g., by
      using a temporary file and renaming it).
    - CrashCleanup: whether the output should be cleaned up if there's a
      crash (e.g., with RemoveFileOnSignal).
    - ImplyCreateDirectories: whether to implicitly create missing
      directories in the path to the file.
    - Text: matches sys::fs::OF_Text.
    - CRLF: matches sys::fs::OF_CRLF.
    - Append: matches sys::fs::OF_Append and can use with AtomicWrite
      for atomic append.
- Each "Flag" has `setFlag(bool)` and `bool getFlag()` and shortcuts
  `setFlag()` and `setNoFlag()`. The setters are `constexpr` and return
  `OutputConfig&` to make it easy to declare a default value for a filed
  in a class or struct.
- Setters and getters for Binary and TextWithCRLF are derived from Text
  and CRLF. For convenience, sys::fs::OpenFlags can be passed
  directly to setOpenFlags().

This patch intentionally lacks a number of important features that have
been left for follow-ups:

- Set a (virtual) current working directory.
- Create a directory.
- Create a file or directory with a unique name (avoiding collisions
  with existing filenames).

Patch originally by dexonsmith
Adopt new virtual output backend in CompilerInstance.
@github-actions
Copy link

github-actions bot commented Oct 6, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@cachemeifyoucan cachemeifyoucan force-pushed the virtualize-llvm-tools-output branch from 58763b6 to edc39ba Compare October 6, 2023 21:06
size_t PreferredBufferSize;
};

/// Adaptor to create a stream class that proxies another \a raw_ostream.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds more like a decorator than an adapter, maybe?

It's a type T that wraps another type T? (but it has a template argument, so maybe it's meant to be able to wrap some other stream types)

Perhaps some more words about how/why/what this tool provides would be useful?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal for this class is to bridge some current usages of raw_ostream output to use VirtualOutputBackend/VirtualOutputFile. It would make more sense if look at it together with the 2nd commit.

After virtualizing the output, we have VirtualOutputFile that represents the output from llvm tools and the file can be keep() or discard() with proper error handling, etc. VirtualOutputFile should be owning the underlying output stream, for example, raw_fd_ostream. On the other hand, there are certain function/tool want to own its output stream, and a proxy to the actual underlying output stream can be created from VirtualOutputFile and the proxy can be owned by other part of the code.

Of course, when the VirtualOutputFile going to close the owning stream, it will make sure all proxies will be close and further write to the proxy will cause crash. This actually caught some bugs in our tools.

@rnk rnk requested a review from zmodem November 7, 2023 00:10
Copy link
Collaborator

@zmodem zmodem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started reviewing the first commit, but wanted to ask some high-level questions which will make it easier to review:

The PR description doesn't say so, but the motivation for this is the CAS work, right?

LLVM already has a "virtual filesystem", and I thought at least for Clang most file operations go through that already. So what gaps are there that need to be filled for the needs of CAS?

/// template instantions.
class raw_ostream_proxy_adaptor_base {
protected:
raw_ostream_proxy_adaptor_base() = delete;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs to be explicitly deleted when you're providing a user-defined constructor further down?

/// Stop proxying the stream, taking the derived object by reference as \p
/// ThisProxyOS. Updates \p ThisProxyOS to stop buffering before setting \a
/// OS to \c nullptr, ensuring that future writes crash immediately.
void resetProxiedOS(raw_ostream &ThisProxyOS) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter is confusing to me. Is ThisProxyOS always going to be == OS here? If not, how are they related? If yes, why take the parameter?

namespace llvm {

/// Common bits for \a raw_ostream_proxy_adaptor<>, split out to dedup in
/// template instantions.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect a lot of instantiations, to make this extra level of abstraction worth it?

}
};

/// Non-owning proxy for a \a raw_ostream. Enables passing a stream into of an
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "into of" (also for raw_pwrite_stream_proxy).

public:
// Choose a strange buffer size to ensure it doesn't collide with the default
// on \a raw_ostream.
constexpr static const size_t PreferredBufferSize = 63;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The const seems redundant, and I think static constexpr is the more common ordering.

bool IsDisplayed = false;
};

constexpr const size_t BufferedNoPwriteSmallVectorStream::PreferredBufferSize;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant const

EXPECT_EQ(0u, ProxyOS.GetBufferSize());
EXPECT_FALSE(ProxyOS.hasProxiedOS());

#if GTEST_HAS_DEATH_TEST
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be guarded by asserts also being enabled?

@cachemeifyoucan
Copy link
Collaborator Author

@zmodem Sorry I totally missed you comments. I thought no one was reviewing so I kind of abandoned the process.

Recreate the PR in a stack for new reviews. All of your comments should be addressed:

3713985 Support: Add proxies for raw_ostream and raw_pwrite_stream
  ✨  Created new Pull Request #113362: https://github.com/llvm/llvm-project/pull/113362
3587264 Support: Add vfs::OutputBackend and OutputFile to virtualize compiler outputs
  ✨  Created new Pull Request #113363: https://github.com/llvm/llvm-project/pull/113363
c59aacd Frontend: Adopt llvm::vfs::OutputBackend in CompilerInstance
  ✨  Created new Pull Request #113364: https://github.com/llvm/llvm-project/pull/113364

@cachemeifyoucan
Copy link
Collaborator Author

Close as moving to new stacked PR for easy review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category llvm:support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants