Commit 219a7c4

Merge pull request #29 from crutcher/crutcher-improved-readme
Improved landing README.md.
2 parents 733c2ea + 7ed8a00

1 file changed (+82 -49 lines)

README.md

@@ -1,66 +1,99 @@
-# Tensor Tapestry Compiler Suite
+# Tapestry Tensor Expression Compiler Suite
+
+<center><b>"It's Just One Ocean"<br/>-Crutcher</b></center>
+
+## Overview
+
+<img style="float: right; width: 20%; margin: 10px" alt="linear.relu.4x" src="docs/media/linear.relu.4x.ortho.jpg"/>
+
+**Tapestry** is an experimental tensor expression compiler framework.
+
+Modern GPU-filled datacenters contain thousands of nodes with 8+ GPUs each, and are capable of
+performing trillions of floating-point operations per second. The goal of **Tapestry** is to unlock
+the full potential of these datacenters by providing a foundational programming environment for
+scalable, optimized, massively multi-GPU tensor programs.
+
+Tensor programs underlie all deep-network-based AI and all finite-element numerical simulations,
+including numerical weather and fluid simulations, protein folding and drug discovery simulations,
+quantum chemistry simulations, financial simulations, and material design and manufacturing
+simulations. Modern tensor programming environments are designed to maximize the productivity of
+developers working on single-GPU workstations, and struggle to express programs which can be
+scheduled across even a few GPUs. Some of these frameworks do offer solutions for scaling up
+limited classes of workloads, but no general-purpose solution exists for scaling up arbitrary
+tensor programs.
+
+Multiple companies operate with annual hardware budgets above $1B for these simulations; worldwide,
+tens of billions of dollars are spent on these calculations every year.
+
+Though it is difficult to predict the speedups of a new optimizing compiler in advance, the
+semantics of current programming models mean that the vast majority of existing tensor applications
+run with no meaningful structural optimization: the programs execute exactly as human engineers
+wrote them. This is akin to executing a SQL query directly, without any query planner or optimizer.
+The potential efficiency wins for existing applications from an optimizing compiler are therefore
+large: conservatively in the 30% range, and for some applications dramatically larger.
+
+Irrespective of efficiency wins, the potential for new applications is tremendous: existing
+applications are limited by interconnect scheduling and by the manual design of the programs, and
+removing these limitations will enable applications which are not possible today.
+
+**Tapestry** currently rests on years of development towards a solid theoretical foundation:
+shardable, composable, and re-writable polyhedral-model tensor block algebra expressions on an
+extensible compiler framework. The present work focuses on turning this mathematical foundation
+into a practical compiler suite. The expectation is that the project needs 1-3 aggregate
+engineer-years of work to reach a state where it can compile real-world applications.
+
+This is a heavy-lift project; the payoffs are huge, but the work required to climb from theory back
+to practical parity with existing frameworks is substantial. There are many opportunities to build
+useful applications along the way, empowered by that solid theoretical foundation. We are seeking
+contributors, reviewers, and enthusiasts to help bring this project to life sooner. Funding
+support, or safe harbor in a larger organization, would also be very helpful.

-See the full [Tapestry Documentation](docs/README.md) for detailed information.
-
-Join the Discord Server:
+## Getting Started

-[![Banner](https://invidget.switchblade.xyz/PNpSrFMeUb?theme=light)](https://discord.gg/PNpSrFMeUb)
+### Read the Documentation

-**Tapestry** is an experimental tensor expression optimizing compiler suite.
+The full [Tapestry Documentation](docs/README.md) provides much more detailed information about the
+project's motivation, goals, design, and implementation.

-It exists to make it easy to optimize applications (such as AI) to maximally exploit both
-datacenters full of GPUs, and integrated FPGA stacks.
+### Join the Discord

-The goal of **Tapestry** is to provide an ecosystem for a high-performance stochastic pareto-front
-optimizer for distributed tensor expressions, targeting optimizations which are permitted to search
-for extended time on a large number of machines.
+[![Banner](https://invidget.switchblade.xyz/PNpSrFMeUb?theme=light)](https://discord.gg/PNpSrFMeUb)

-Here are examples showing a **Linear**/**ReLU** pipeline, with and without sub-block sharding;
-demonstrating the potential for sub-shard operation fusion:
+If you have any interest in the project, please join the Discord server. We are actively looking
+for reviewers, contributors, fans, theorists, and developers, and would love to have you involved.

-<table cellborder="0">
-<tr>
-<td>
-<div style="width: 100%; margin: auto">
-<img alt="linear.relu" src="docs/media/linear.relu.ortho.jpg"/>
-</div>
-</td>
-<td>
-<div style="width: 100%; margin: auto">
-<img alt="linear.relu.4x" src="docs/media/linear.relu.4x.ortho.jpg"/>
-</div>
-</td>
-</tr>
-</table>
+A huge portion of bringing this project to life is building a community of enthusiasts and experts
+who can help guide the project, not only through theory and code, but also through iterative
+development of the documentation, making the project accessible to wider audiences.

-## Contributing
+We are particularly interested in:

-I'm actively looking for contributors to help with building or reviewing the project.
+- document reviewers
+- project managers
+- programmers
+- compiler theorists

-If you'd like to get involved, please post any questions in the project
-[Discussions](https://github.com/crutcher/loom/discussions) board, or open an issue.
+### File an Issue / Bug

-We could create a Discord server; if we got enough traction.
+We are actively looking for feedback on the project. If you have any issues, please file a bug on
+the [Issues](https://github.com/crutcher/tapestry/issues) page.

-I'm particularly interested in contributors with experience in the following areas:
+### Join the Discussions

-- maven lifecycle / package publishing
-- technical documentation / editing
-- compiler design
-- tensor algebra
-- optimization theory
-- graph transformations
-- graph representation
-- distributed computing
-- graph visualization
+If you have longer-form concerns to discuss, please post them on the project
+[Discussions](https://github.com/crutcher/loom/discussions) board.

-## Getting Started
+## Setup / Contributing Code

-In the current stage of development, **loom** produces no tool targets; and exists solely as a
-collection of libraries and tests.
+If you are interested in running the existing test suites, or in contributing code, you'll need to
+clone the repository and set up the development environment.

-It **should** setup cleanly in any modern development environment; but full external dependencies
-are not yet documented.
+The project is a JDK 21 multi-module Maven/Java project, and should set up cleanly in any modern
+development IDE (JetBrains, VS Code, etc.).

-Documenting missing dependencies is a high priority and setup instructions is another high priority
-which contributors could help with.
+That said, the project has been developed by one person thus far, and may have some missing
+dependencies or undocumented requirements. If you run into any issues, please join the Discord or
+file a bug (or both!) with as much information as possible, and I'll prioritize fixing the cause or
+documenting the missing dependency.
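
The setup section in the new README names the tools but not the commands. A minimal command-line
sketch follows; it is illustrative only, assumes a JDK 21 toolchain is already installed, takes the
clone URL from the Issues link above, and uses generic Maven goals rather than project-documented
instructions.

```bash
# Minimal sketch for cloning the repository and running the test suites.
# The clone URL is inferred from the Issues link in the README; the Maven
# goals are generic assumptions, not documented project instructions.
git clone https://github.com/crutcher/tapestry.git
cd tapestry

# Confirm that a JDK 21 toolchain is active before building.
java -version

# Build all modules and run the existing tests.
mvn clean test
```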
