Commit 712ffe7

Merge pull request #2 from AIComputing101/coketaste/docker
Fix Dockerfiles for CUDA and AMD
2 parents 6e97628 + 208cf97 commit 712ffe7

8 files changed: +270, -341 lines changed

.github/workflows/markdown-link-check-config.json

Lines changed: 11 additions & 4 deletions
@@ -7,10 +7,17 @@
       "pattern": "^https://github.com/yourusername/"
     }
   ],
-  "aliveStatusCodes": [200, 206, 999],
-  "timeout": "10s",
+  "httpHeaders": [
+    {
+      "urls": ["https://rocmdocs.amd.com/"],
+      "headers": {
+        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
+      }
+    }
+  ],
+  "timeout": "15s",
   "retryOn429": true,
-  "retryCount": 5,
+  "retryCount": 3,
   "fallbackRetryDelay": "30s",
-  "aliveStatusCodes": [200, 206]
+  "aliveStatusCodes": [200, 206, 999]
 }
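
Review note: the previous version of this config listed `aliveStatusCodes` twice. If the link checker reads the file with JavaScript's `JSON.parse`, which keeps the last occurrence of a repeated key, the narrower `[200, 206]` list would have been the one in effect. The sketch below is one rough way to spot such repeats. `count_key` is a hypothetical helper, the sample file is an abbreviated stand-in for the old config, and the check is purely text-based (it counts matching lines and ignores nesting).

```shell
#!/bin/sh
# count_key FILE KEY (hypothetical helper, not part of the commit):
# prints the number of lines on which "KEY": appears in FILE.
count_key() {
    grep -c "\"$2\"[[:space:]]*:" "$1"
}

# Abbreviated stand-in for the old config, with the key listed twice.
tmp="$(mktemp)"
cat > "$tmp" << 'EOF'
{
  "aliveStatusCodes": [200, 206, 999],
  "timeout": "10s",
  "retryOn429": true,
  "retryCount": 5,
  "fallbackRetryDelay": "30s",
  "aliveStatusCodes": [200, 206]
}
EOF

count_key "$tmp" aliveStatusCodes   # prints 2
rm -f "$tmp"
```

A count above 1 for a top-level key is a hint to open the file and check which value was intended.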

README.md

Lines changed: 8 additions & 8 deletions
@@ -11,7 +11,7 @@
 
 *From beginner fundamentals to production-ready optimization techniques*
 
-**Quick Navigation:** [🚀 Quick Start](#-quick-start)[📚 Modules](#-modules)[🐳 Docker Setup](#-docker-development)[📖 Documentation](SUMMARY.md)[🤝 Contributing](CONTRIBUTING.md)
+**Quick Navigation:** [🚀 Quick Start](#-quick-start)[📚 Modules](#-modules)[🐳 Docker Setup](#-docker-development)[🤝 Contributing](CONTRIBUTING.md)
 
 ---
 
@@ -114,7 +114,7 @@ cd modules/module1/examples
 
 **📈 Progressive Learning Path: 70+ Examples • 50+ Hours • Beginner to Expert**
 
-**[📖 View Detailed Curriculum](SUMMARY.md)**
+**[ View Learning Modules](modules/)**
 
 ## 🛠️ Prerequisites
 
@@ -306,13 +306,13 @@ make check-hip
 ./docker/scripts/build.sh --clean --all
 ```
 
-**[📖 Full Troubleshooting Guide ](docs/troubleshooting.md)**
+**[� Need Help? Check Common Issues ](README.md#-troubleshooting)**
 
 ## 📖 Documentation
 
 | Document | Description |
 |----------|-------------|
-| [**SUMMARY.md**](SUMMARY.md) | Complete curriculum overview and learning paths |
+| **README.md** | Main project documentation and getting started guide |
 | [**CONTRIBUTING.md**](CONTRIBUTING.md) | How to contribute to the project |
 | [**Docker Guide**](docker/README.md) | Complete Docker setup and usage |
 | [**Module READMEs**](modules/) | Individual module documentation |
@@ -327,13 +327,13 @@ We welcome contributions from the community! This project thrives on:
 - 🔧 **Optimizations**: Performance improvements and best practices
 - 🌐 **Platform Support**: Cross-platform compatibility improvements
 
-**[📖 Contributing Guidelines →](CONTRIBUTING.md)****[🐛 Report Issues →](../../issues)****[💡 Request Features →](../../issues/new?template=feature_request.md)**
+**[📖 Contributing Guidelines →](CONTRIBUTING.md)****[🐛 Report Issues →](https://github.com/AIComputing101/gpu-programming-101/issues)****[💡 Request Features →](https://github.com/AIComputing101/gpu-programming-101/issues/new?template=feature_request.md)**
 
 ## 🏆 Community & Support
 
 - 🌟 **Star this project** if you find it helpful!
-- 🐛 **Report bugs** using our [issue templates](../../issues/new/choose)
-- 💬 **Join discussions** in [GitHub Discussions](../../discussions)
+- 🐛 **Report bugs** using our [issue templates](https://github.com/AIComputing101/gpu-programming-101/issues/new/choose)
+- 💬 **Join discussions** in [GitHub Discussions](https://github.com/AIComputing101/gpu-programming-101/discussions)
 - 📧 **Get help** from the community and maintainers
 
 ## 📄 License
@@ -371,7 +371,7 @@ Stephen Shao, "GPU Programming 101: A Comprehensive Educational Project for CUDA
 
 **Ready to unlock the power of GPU computing?**
 
-**[🚀 Get Started Now](#-quick-start)****[📚 View Curriculum](SUMMARY.md)****[🐳 Try Docker](docker/README.md)**
+**[🚀 Get Started Now](#-quick-start)****[📚 View Modules](modules/)****[🐳 Try Docker](docker/README.md)**
 
 ---
 

docker/cuda/Dockerfile

Lines changed: 84 additions & 100 deletions
@@ -13,9 +13,9 @@ LABEL ubuntu.version="22.04"
 # Avoid interactive prompts during package installation
 ARG DEBIAN_FRONTEND=noninteractive
 
-# Install essential development tools
+# Install essential development tools for GPU programming
 RUN apt-get update && apt-get install -y \
-    # Basic development tools
+    # Core development tools
     build-essential \
     cmake \
     git \
@@ -25,49 +25,33 @@ RUN apt-get update && apt-get install -y \
     nano \
     htop \
     tree \
-    # Python development
+    # Minimal Python for basic scripting (not data science)
     python3 \
     python3-pip \
     python3-dev \
     # Additional utilities
     pkg-config \
     software-properties-common \
-    apt-transport-https \
-    ca-certificates \
-    gnupg \
-    lsb-release \
-    # GPU monitoring tools
+    # GPU monitoring tools (installed but won't work during build)
     nvidia-utils-535 \
     # Debugging and profiling tools
     gdb \
     valgrind \
     strace \
-    # Network tools for downloading samples
+    # Network tools
     net-tools \
     iputils-ping \
     && rm -rf /var/lib/apt/lists/*
 
-# Install NVIDIA profiling tools (Nsight Systems, Compute) - Latest 2025 versions
-RUN apt-get update && apt-get install -y \
-    nsight-systems-2025.1.1 \
-    nsight-compute-2025.1.1 \
-    && rm -rf /var/lib/apt/lists/* || \
-    # Fallback to 2024 versions if 2025 not available yet
-    (apt-get update && apt-get install -y \
-    nsight-systems-2024.6.1 \
-    nsight-compute-2024.3.1 \
-    && rm -rf /var/lib/apt/lists/*)
-
-# Install Python packages for data analysis and visualization
+# Install optional CUDA tools if available
+RUN apt-get update && \
+    (apt-get install -y cuda-tools-12-9 || apt-get install -y cuda-tools || true) && \
+    rm -rf /var/lib/apt/lists/*
+
+# Install minimal Python packages for basic development (no heavy data science libs)
 RUN pip3 install --no-cache-dir \
     numpy \
-    matplotlib \
-    seaborn \
-    pandas \
-    jupyter \
-    jupyterlab \
-    plotly \
-    scipy
+    matplotlib
 
 # Set up CUDA environment variables
 ENV PATH=/usr/local/cuda/bin:${PATH}
@@ -78,8 +62,8 @@ ENV CUDA_VERSION=12.9.1
 ENV NVIDIA_VISIBLE_DEVICES=all
 ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
 
-# Verify CUDA installation
-RUN nvcc --version && nvidia-smi
+# Verify CUDA compiler installation (skip nvidia-smi as no GPU during build)
+RUN nvcc --version
 
 # Create development workspace
 WORKDIR /workspace
@@ -98,85 +82,85 @@ RUN echo 'alias ll="ls -alF"' >> /root/.bashrc && \
     echo 'export PS1="\[\e[1;32m\][CUDA-DEV]\[\e[0m\] \w $ "' >> /root/.bashrc
 
 # Create a simple GPU test script
-RUN cat > /workspace/test-gpu.sh << 'EOF'
-#!/bin/bash
-echo "=== GPU Programming 101 - CUDA Environment Test ==="
-echo "Date: $(date)"
-echo ""
-
-echo "=== CUDA Compiler ==="
-nvcc --version
-echo ""
-
-echo "=== GPU Information ==="
-nvidia-smi --query-gpu=name,memory.total,compute_cap,driver_version --format=csv
-echo ""
-
-echo "=== CUDA Samples Test ==="
-if [ -d "/usr/local/cuda/samples" ]; then
-    echo "CUDA samples directory found"
-else
-    echo "CUDA samples not found - this is normal for newer CUDA versions"
-fi
-
-echo "=== Environment Variables ==="
-echo "CUDA_HOME: $CUDA_HOME"
-echo "PATH: $PATH"
-echo "LD_LIBRARY_PATH: $LD_LIBRARY_PATH"
-echo ""
-
-echo "=== Build Test ==="
-cd /tmp
-cat > test.cu << 'CUDA_EOF'
-#include <cuda_runtime.h>
-#include <stdio.h>
-
-__global__ void hello() {
-    printf("Hello from GPU thread %d!\n", threadIdx.x);
-}
-
-int main() {
-    printf("CUDA Test Program\n");
-    hello<<<1, 5>>>();
-    cudaDeviceSynchronize();
-    printf("GPU kernel completed!\n");
-    return 0;
-}
-CUDA_EOF
-
-echo "Compiling test CUDA program..."
-if nvcc -o test test.cu; then
-    echo "✓ Compilation successful"
-    echo "Running test program:"
-    ./test
-    echo "✓ CUDA environment is working correctly!"
-else
-    echo "✗ Compilation failed"
-    exit 1
-fi
-
-rm -f test test.cu
-echo ""
-echo "=== All tests completed ==="
-EOF
+RUN printf '#!/bin/bash\n\
+echo "=== GPU Programming 101 - CUDA Environment Test ==="\n\
+echo "Date: $(date)"\n\
+echo ""\n\
+\n\
+echo "=== CUDA Compiler ==="\n\
+nvcc --version\n\
+echo ""\n\
+\n\
+echo "=== GPU Information ==="\n\
+if nvidia-smi --query-gpu=name,memory.total,compute_cap,driver_version --format=csv 2>/dev/null; then\n\
+    echo "GPU detected successfully"\n\
+else\n\
+    echo "No GPU detected or nvidia-smi not available"\n\
+fi\n\
+echo ""\n\
+\n\
+echo "=== Environment Variables ==="\n\
+echo "CUDA_HOME: $CUDA_HOME"\n\
+echo "PATH: $PATH"\n\
+echo "LD_LIBRARY_PATH: $LD_LIBRARY_PATH"\n\
+echo ""\n\
+\n\
+echo "=== Build Test ==="\n\
+cd /tmp\n\
+cat > test.cu << '"'"'CUDA_EOF'"'"'\n\
+#include <cuda_runtime.h>\n\
+#include <stdio.h>\n\
+\n\
+__global__ void hello() {\n\
+    printf("Hello from GPU thread %%d!\\n", threadIdx.x);\n\
+}\n\
+\n\
+int main() {\n\
+    printf("CUDA Test Program\\n");\n\
+    \n\
+    int deviceCount;\n\
+    cudaError_t error = cudaGetDeviceCount(&deviceCount);\n\
+    \n\
+    if (error != cudaSuccess) {\n\
+        printf("CUDA Error: %%s\\n", cudaGetErrorString(error));\n\
+        printf("No CUDA-capable devices found\\n");\n\
+        return 0;\n\
+    }\n\
+    \n\
+    printf("Found %%d CUDA device(s)\\n", deviceCount);\n\
+    hello<<<1, 5>>>();\n\
+    cudaDeviceSynchronize();\n\
+    printf("GPU kernel completed!\\n");\n\
+    return 0;\n\
+}\n\
+CUDA_EOF\n\
+\n\
+echo "Compiling test CUDA program..."\n\
+if nvcc -o test test.cu; then\n\
+    echo "✓ Compilation successful"\n\
+    echo "Running test program:"\n\
+    ./test\n\
+    echo "✓ CUDA environment is working correctly!"\n\
+else\n\
+    echo "✗ Compilation failed"\n\
+    exit 1\n\
+fi\n\
+\n\
+rm -f test test.cu\n\
+echo ""\n\
+echo "=== All tests completed ==="\n' > /workspace/test-gpu.sh
 
 RUN chmod +x /workspace/test-gpu.sh
 
-# Install additional CUDA samples and utilities
+# Install CUDA samples for learning and reference
 RUN cd /workspace && \
     git clone https://github.com/NVIDIA/cuda-samples.git && \
     cd cuda-samples && \
     git checkout v12.9
 
-# Create jupyter kernel for CUDA (for notebooks)
-RUN python3 -m ipykernel install --name cuda-kernel --display-name "CUDA Python"
-
-# Expose Jupyter port
-EXPOSE 8888
-
 # Default command
 CMD ["/bin/bash"]
 
-# Health check to verify GPU access
+# Health check to verify GPU access (will only work when GPU is available)
 HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
-    CMD nvidia-smi > /dev/null 2>&1 || exit 1
+    CMD nvcc --version > /dev/null 2>&1 || exit 1
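
Review note: replacing the heredoc with `RUN printf '...'` means the script text now passes through `printf` once, which is why the new lines write `%%d` and `\\n` where the heredoc had `%d` and `\n`, and why the inner heredoc delimiter is spelled `'"'"'CUDA_EOF'"'"'`. The sketch below reproduces three of those lines outside Docker to show what ends up in the generated file. `render_snippet` is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# render_snippet (hypothetical helper, not part of the commit): prints three
# lines the same way the Dockerfile's single-quoted printf format does.
render_snippet() {
    # Single quotes keep $(date) literal; it runs later, when test-gpu.sh runs.
    printf 'echo "Date: $(date)"\n'
    # '"'"' closes the quote, emits one literal ', and reopens the quote.
    printf 'cat > test.cu << '"'"'CUDA_EOF'"'"'\n'
    # %% becomes a literal %, and \\n becomes the two characters \ and n,
    # so the C source keeps its own %d and \n.
    printf 'printf("Found %%d CUDA device(s)\\n", deviceCount);\n'
}

render_snippet
# Output:
#   echo "Date: $(date)"
#   cat > test.cu << 'CUDA_EOF'
#   printf("Found %d CUDA device(s)\n", deviceCount);
```

Running a line through `printf` this way before committing is a quick check that the escaping produces the intended script text.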
