
Commit 048ba40

sblotner authored and alsrgv committed
Restructure Horovod doc landing page (horovod#1158)
This squash merge bundles the following commits. Each is also Signed-off-by: Stephanie Blotner <[email protected]>; additional sign-offs are noted per item.

* Edit Horovod docs (horovod#1119)
* Fix tf-nightly-gpu break (horovod#1124). Signed-off-by: Alex Sergeev <[email protected]>
* Fix minor spacing issue in GPU and summary docs
* Update Horovod doc site font, add landing page accordion
* Reorder and clean up titles in left navigation
* Add basic pages for keras, mxnet, pytorch, and tensorflow
* Fix mpirun 4 GPU example
* Minor updates to README
* Make Open MPI an advanced topic
* Fix REAMDE TF link
* Resolve conflict
* Fix minor spacing issue in GPU and summary docs (horovod#1127)
* Updated horovodrun command (horovod#1126). Signed-off-by: Carsten Jacobsen <[email protected]>
* Ask GCC version when filling out the issue (horovod#1133). Signed-off-by: Alex Sergeev <[email protected]>
* Pin tf-nightly to 1.14.1.dev20190606 & remove torchvision-nightly (horovod#1137). Signed-off-by: Alex Sergeev <[email protected]>
  * Pin tf-nightly to 1.14.1.dev20190606
  * torchvision-nightly -> torchvision
  * Pin torchvision to a version that does not require CUDA
* Replace .step(synchronize=False) with optimizer.skip_synchronize() (horovod#1132). Signed-off-by: Alex Sergeev <[email protected]>
  * Replace .step(synchronize=False) with optimizer.already_synchronized()
  * Fix docs
  * Rename to skip_synchronize() and fix test
* Unpin tf-nightly version (horovod#1140). Signed-off-by: Alex Sergeev <[email protected]>
* Bump version to 0.16.4 (horovod#1139). Signed-off-by: Alex Sergeev <[email protected]>
* remove MSHADOW_USE_F16C (horovod#1141). Signed-off-by: Lin Yuan <[email protected]>
* Adding support for multiple CUDA streams for NCCL operations. (horovod#1128). Signed-off-by: Josh Romero <[email protected]>
  * Adding support for multiple CUDA streams for NCCL operations.
  * Fix compilation without CUDA or NCCL enabled.
  * Updating variable names.
* Add Singularity example page (horovod#1149). Signed-off-by: Alex Sergeev <[email protected]>
* Update Gloo api for data layer (horovod#1120). Signed-off-by: Travis Addair <[email protected]> and Sihan Zeng <[email protected]>
  * Added gloo as a submodule
  * Added cmake build for gloo
  * Added allreduce and broadcast ops for Gloo
  * Enable MPI
  * Fixed transport
  * Use MPI comm from Horovod
  * Changed gloo allreduce to always make use of fusion buffer
  * Copy directly to output buffer
  * Unique ptr to shared ptr
  * Fixed root pointer rank
  * Added float16 support for Gloo
  * Use allgatherv
  * Use GlooAllgather by default
  * Pulled down update to gloo
  * update allgather allreduce and broadcast for unified gloo api
  * update setup.py & MANIFEST.in
  * Add runtime flag to support switching betwee gloo and mpi
  * Resolve review
  * fix iface issue
  * set Gloo to be automatically compiled except on MacOS
  * fix code style
  * integrate compile flag
  * fixed reviews
  * remove cmake from require list if system has cmake installed
  * cmake becomes a blocking issue, temporarily work it around by skip compiling gloo if cmake is not installed.
  * rebase on the latest master
  * remove chmod related code
  * final fix up
* MLSL: move mlsl_init before mpi_init, add mlsl_finalize call (horovod#1156). Signed-off-by: Mikhail Shiryaev <[email protected]>
* Update Horovod GitHub URL (horovod#1147). Signed-off-by: Alex Sergeev <[email protected]>
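
Editor's note on the ``skip_synchronize()`` item above: this renames the hook that Horovod's PyTorch ``DistributedOptimizer`` exposes for separating the gradient allreduce from the weight update. A minimal sketch of the resulting pattern, assuming ``horovod.torch`` 0.16.4 or later; the tiny model and random batch are placeholders for illustration, not taken from this commit:

.. code-block:: python

    import torch
    import torch.nn.functional as F
    import horovod.torch as hvd

    hvd.init()

    # Placeholder model and batch (assumptions for illustration only).
    model = torch.nn.Linear(10, 2)
    data, target = torch.randn(32, 10), torch.randint(0, 2, (32,))

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    optimizer = hvd.DistributedOptimizer(optimizer,
                                         named_parameters=model.named_parameters())

    optimizer.zero_grad()
    loss = F.cross_entropy(model(data), target)
    loss.backward()

    # Run the gradient allreduce explicitly so gradients can be clipped first.
    optimizer.synchronize()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

    # Apply the update without a second synchronization; this context manager
    # replaces the old .step(synchronize=False) argument.
    with optimizer.skip_synchronize():
        optimizer.step()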
1 parent 50a3c01 commit 048ba40

File tree

13 files changed: +294 additions, -263 deletions


README.rst

Lines changed: 9 additions & 121 deletions
@@ -103,11 +103,19 @@ Concepts
 Horovod core principles are based on `MPI <http://mpi-forum.org/>`_ concepts such as *size*, *rank*,
 *local rank*, **allreduce**, **allgather** and, *broadcast*. See `this page <docs/concepts.rst>`_ for more details.
 
+Supported frameworks
+--------------------
+See these pages for Horovod examples and best practices:
+
+- `Horovod with TensorFlow <#usage>`__ (Usage section below)
+- `Horovod with Keras <docs/keras.rst>`_
+- `Horovod with PyTorch <docs/pytorch.rst>`_
+- `Horovod with MXNet <docs/mxnet.rst>`_
 
 Usage
 -----
 
-To use Horovod, make the following additions to your program:
+To use Horovod, make the following additions to your program. This example uses TensorFlow.
 
 1. Run ``hvd.init()``.
 
@@ -202,132 +210,12 @@ page for more instructions, including RoCE/InfiniBand tweaks and tips for dealin
 
 7. To run in Singularity, see `Singularity <https://github.com/sylabs/examples/tree/master/machinelearning/horovod>`_.
 
-Keras
------
-Horovod supports Keras and regular TensorFlow in similar ways.
-
-See full training `simple <https://github.com/horovod/horovod/blob/master/examples/keras_mnist.py>`_ and `advanced <https://github.com/horovod/horovod/blob/master/examples/keras_mnist_advanced.py>`_ examples.
-
-**Note**: Keras 2.0.9 has a `known issue <https://github.com/fchollet/keras/issues/8353>`_ that makes each worker allocate
-all GPUs on the server, instead of the GPU assigned by the *local rank*. If you have multiple GPUs per server, upgrade
-to Keras 2.1.2 or downgrade to Keras 2.0.8.
-
-
 Estimator API
 -------------
 Horovod supports Estimator API and regular TensorFlow in similar ways.
 
 See a full training `example <examples/tensorflow_mnist_estimator.py>`_.
 
-MXNet
------
-Horovod supports MXNet and regular TensorFlow in similar ways.
-
-See full training `MNIST <https://github.com/horovod/horovod/blob/master/examples/mxnet_mnist.py>`_ and `ImageNet <https://github.com/horovod/horovod/blob/master/examples/mxnet_imagenet_resnet50.py>`_ examples. The script below provides a simple skeleton of code block based on MXNet Gluon API.
-
-.. code-block:: python
-
-    import mxnet as mx
-    import horovod.mxnet as hvd
-    from mxnet import autograd
-
-    # Initialize Horovod
-    hvd.init()
-
-    # Pin GPU to be used to process local rank
-    context = mx.gpu(hvd.local_rank())
-    num_workers = hvd.size()
-
-    # Build model
-    model = ...
-    model.hybridize()
-
-    # Create optimizer
-    optimizer_params = ...
-    opt = mx.optimizer.create('sgd', **optimizer_params)
-
-    # Initialize parameters
-    model.initialize(initializer, ctx=context)
-
-    # Fetch and broadcast parameters
-    params = model.collect_params()
-    if params is not None:
-        hvd.broadcast_parameters(params, root_rank=0)
-
-    # Create DistributedTrainer, a subclass of gluon.Trainer
-    trainer = hvd.DistributedTrainer(params, opt)
-
-    # Create loss function
-    loss_fn = ...
-
-    # Train model
-    for epoch in range(num_epoch):
-        train_data.reset()
-        for nbatch, batch in enumerate(train_data, start=1):
-            data = batch.data[0].as_in_context(context)
-            label = batch.label[0].as_in_context(context)
-            with autograd.record():
-                output = model(data.astype(dtype, copy=False))
-                loss = loss_fn(output, label)
-            loss.backward()
-            trainer.step(batch_size)
-
-
-**Note**: The `known issue <https://github.com/horovod/horovod/issues/884>`__ when running Horovod with MXNet on a Linux system with GCC version 5.X and above has been resolved. Please use MXNet 1.4.1 or later releases with Horovod 0.16.2 or later releases to avoid the GCC incompatibility issue. MXNet 1.4.0 release works with Horovod 0.16.0 and 0.16.1 releases with the GCC incompatibility issue unsolved.
-
-PyTorch
--------
-Horovod supports PyTorch and TensorFlow in similar ways.
-
-Example (also see a full training `example <examples/pytorch_mnist.py>`__):
-
-.. code-block:: python
-
-    import torch
-    import horovod.torch as hvd
-
-    # Initialize Horovod
-    hvd.init()
-
-    # Pin GPU to be used to process local rank (one GPU per process)
-    torch.cuda.set_device(hvd.local_rank())
-
-    # Define dataset...
-    train_dataset = ...
-
-    # Partition dataset among workers using DistributedSampler
-    train_sampler = torch.utils.data.distributed.DistributedSampler(
-        train_dataset, num_replicas=hvd.size(), rank=hvd.rank())
-
-    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=..., sampler=train_sampler)
-
-    # Build model...
-    model = ...
-    model.cuda()
-
-    optimizer = optim.SGD(model.parameters())
-
-    # Add Horovod Distributed Optimizer
-    optimizer = hvd.DistributedOptimizer(optimizer, named_parameters=model.named_parameters())
-
-    # Broadcast parameters from rank 0 to all other processes.
-    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
-
-    for epoch in range(100):
-        for batch_idx, (data, target) in enumerate(train_loader):
-            optimizer.zero_grad()
-            output = model(data)
-            loss = F.nll_loss(output, target)
-            loss.backward()
-            optimizer.step()
-            if batch_idx % args.log_interval == 0:
-                print('Train Epoch: {} [{}/{}]\tLoss: {}'.format(
-                    epoch, batch_idx * len(data), len(train_sampler), loss.item()))
-
-
-**Note**: PyTorch support requires NCCL 2.2 or later. It also works with NCCL 2.1.15 if you are not using RoCE or InfiniBand.
-
 mpi4py
 ------
 Horovod supports mixing and matching Horovod collectives with other MPI libraries, such as `mpi4py <https://mpi4py.scipy.org>`_,
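
Editor's note: the Usage steps retained above follow Horovod's standard TensorFlow pattern. A minimal sketch of those steps, assuming the TensorFlow 1.x ``tf.train`` API that this release's examples target; the one-layer model on random data is a placeholder, not part of the commit:

.. code-block:: python

    import tensorflow as tf
    import horovod.tensorflow as hvd

    # 1. Initialize Horovod.
    hvd.init()

    # 2. Pin one GPU per process, selected by local rank.
    config = tf.ConfigProto()
    config.gpu_options.visible_device_list = str(hvd.local_rank())

    # Placeholder graph (assumption, for illustration): a single dense
    # layer on random data; substitute your own model and loss.
    features = tf.random_normal([32, 10])
    labels = tf.random_uniform([32], maxval=2, dtype=tf.int32)
    logits = tf.layers.dense(features, 2)
    loss = tf.losses.sparse_softmax_cross_entropy(labels, logits)

    # 3. Scale the learning rate by the number of workers and wrap the optimizer.
    opt = tf.train.AdagradOptimizer(0.01 * hvd.size())
    opt = hvd.DistributedOptimizer(opt)
    global_step = tf.train.get_or_create_global_step()
    train_op = opt.minimize(loss, global_step=global_step)

    # 4. Broadcast initial variable state from rank 0 to all other processes.
    hooks = [hvd.BroadcastGlobalVariablesHook(0),
             tf.train.StopAtStepHook(last_step=1000)]

    # 5. Checkpoint only on rank 0 so workers do not corrupt each other's state.
    checkpoint_dir = '/tmp/train_logs' if hvd.rank() == 0 else None

    with tf.train.MonitoredTrainingSession(checkpoint_dir=checkpoint_dir,
                                           hooks=hooks,
                                           config=config) as sess:
        while not sess.should_stop():
            sess.run(train_op)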

docs/_static/custom.css

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+/* Custom CSS for landing page accordion */
+
+/* Accordion open/close button style */
+.accordion {
+  background-color: #eee;
+  color: #444;
+  cursor: pointer;
+  padding: 18px;
+  width: 100%;
+  text-align: left;
+  border: none;
+  outline: none;
+  transition: 0.4s;
+  font-family: 'Helvetica Neue';
+  font-size: 16px;
+}
+
+/* Active button style */
+.active, .accordion:hover {
+  background-color: #ccc;
+}
+
+/* Accordion panel style */
+.panel {
+  padding: 0 18px;
+  background-color: white;
+  max-height: 0;
+  overflow: hidden;
+  transition: max-height 0.2s ease-out;
+  font-size: 16px;
+}
+
+/* Plus sign */
+.accordion:after {
+  content: '\02795';
+  font-size: 16px;
+  color: #777;
+  float: right;
+  margin-left: 5px;
+  font-family: 'Helvetica Neue';
+}
+
+/* Minus sign */
+.active:after {
+  content: "\2796";
+}

docs/conf.py

Lines changed: 1 addition & 0 deletions
@@ -91,6 +91,7 @@
     'github_count': 'true',
     'fixed_sidebar': True,
     'sidebar_collapse': True,
+    'font_family': 'Helvetica Neue'
 }
 
 # Add any paths that contain custom static files (such as style sheets) here,

docs/index.rst

Lines changed: 86 additions & 5 deletions
@@ -2,26 +2,106 @@ Horovod documentation
 =====================
 Horovod improves the speed, scale, and resource utilization of deep learning training.
 
+Get started
+-----------
+Choose your deep learning framework to learn how to get started with Horovod.
+
+.. raw:: html
+
+   <button class="accordion">TensorFlow</button>
+   <div class="panel">
+   <p>To use Horovod with TensorFlow on your laptop:
+   <ol>
+   <li><a href="https://www.open-mpi.org/faq/?category=building#easy-build">Install Open MPI 3.1.2 or 4.0.0</a>, or another MPI implementation.</li>
+   <li>Install the Horovod pip package: <code>pip install horovod</code></li>
+   <li>Read <a href="https://horovod.readthedocs.io/en/latest/tensorflow.html">Horovod with TensorFlow</a> for best practices and examples.</li>
+   </ol>
+   Or, use <a href="https://horovod.readthedocs.io/en/latest/gpus_include.html">Horovod on GPUs</a>, in <a href="https://horovod.readthedocs.io/en/latest/spark_include.html">Spark</a>, <a href="https://horovod.readthedocs.io/en/latest/docker_include.html">Docker</a>, <a href="https://github.com/sylabs/examples/tree/master/machinelearning/horovod">Singularity</a>, or Kubernetes (<a href="https://github.com/kubeflow/kubeflow/tree/master/kubeflow/mpi-job">Kubeflow</a>, <a href="https://github.com/kubeflow/mpi-operator/">MPI Operator</a>, <a href="https://github.com/helm/charts/tree/master/stable/horovod">Helm Chart</a>, and <a href="https://github.com/IBM/FfDL/tree/master/etc/examples/horovod/">FfDL</a>).
+   </p>
+   </div>
+
+   <button class="accordion">Keras</button>
+   <div class="panel">
+   <p>To use Horovod with Keras on your laptop:
+   <ol>
+   <li><a href="https://www.open-mpi.org/faq/?category=building#easy-build">Install Open MPI 3.1.2 or 4.0.0</a>, or another MPI implementation.</li>
+   <li>Install the Horovod pip package: <code>pip install horovod</code></li>
+   <li>Read <a href="https://horovod.readthedocs.io/en/latest/keras.html">Horovod with Keras</a> for best practices and examples.</li>
+   </ol>
+   Or, use <a href="https://horovod.readthedocs.io/en/latest/gpus_include.html">Horovod on GPUs</a>, in <a href="https://horovod.readthedocs.io/en/latest/spark_include.html">Spark</a>, <a href="https://horovod.readthedocs.io/en/latest/docker_include.html">Docker</a>, <a href="https://github.com/sylabs/examples/tree/master/machinelearning/horovod">Singularity</a>, or Kubernetes (<a href="https://github.com/kubeflow/kubeflow/tree/master/kubeflow/mpi-job">Kubeflow</a>, <a href="https://github.com/kubeflow/mpi-operator/">MPI Operator</a>, <a href="https://github.com/helm/charts/tree/master/stable/horovod">Helm Chart</a>, and <a href="https://github.com/IBM/FfDL/tree/master/etc/examples/horovod/">FfDL</a>).
+   </p>
+   </div>
+
+   <button class="accordion">PyTorch</button>
+   <div class="panel">
+   <p>To use Horovod with PyTorch on your laptop:
+   <ol>
+   <li><a href="https://www.open-mpi.org/faq/?category=building#easy-build">Install Open MPI 3.1.2 or 4.0.0</a>, or another MPI implementation.</li>
+   <li>Install the Horovod pip package: <code>pip install horovod</code></li>
+   <li>Read <a href="https://horovod.readthedocs.io/en/latest/pytorch.html">Horovod with PyTorch</a> for best practices and examples.</li>
+   </ol>
+   Or, use <a href="https://horovod.readthedocs.io/en/latest/gpus_include.html">Horovod on GPUs</a>, in <a href="https://horovod.readthedocs.io/en/latest/spark_include.html">Spark</a>, <a href="https://horovod.readthedocs.io/en/latest/docker_include.html">Docker</a>, <a href="https://github.com/sylabs/examples/tree/master/machinelearning/horovod">Singularity</a>, or Kubernetes (<a href="https://github.com/kubeflow/kubeflow/tree/master/kubeflow/mpi-job">Kubeflow</a>, <a href="https://github.com/kubeflow/mpi-operator/">MPI Operator</a>, <a href="https://github.com/helm/charts/tree/master/stable/horovod">Helm Chart</a>, and <a href="https://github.com/IBM/FfDL/tree/master/etc/examples/horovod/">FfDL</a>).
+   </p>
+   </div>
+
+   <button class="accordion">Apache MXNet</button>
+   <div class="panel">
+   <p>To use Horovod with Apache MXNet on your laptop:
+   <ol>
+   <li><a href="https://www.open-mpi.org/faq/?category=building#easy-build">Install Open MPI 3.1.2 or 4.0.0</a>, or another MPI implementation.</li>
+   <li>Install the Horovod pip package: <code>pip install horovod</code></li>
+   <li>Read <a href="https://horovod.readthedocs.io/en/latest/mxnet.html">Horovod with MXNet</a> for best practices and examples.</li>
+   </ol>
+   Or, use <a href="https://horovod.readthedocs.io/en/latest/gpus_include.html">Horovod on GPUs</a>, in <a href="https://horovod.readthedocs.io/en/latest/spark_include.html">Spark</a>, <a href="https://horovod.readthedocs.io/en/latest/docker_include.html">Docker</a>, <a href="https://github.com/sylabs/examples/tree/master/machinelearning/horovod">Singularity</a>, or Kubernetes (<a href="https://github.com/kubeflow/kubeflow/tree/master/kubeflow/mpi-job">Kubeflow</a>, <a href="https://github.com/kubeflow/mpi-operator/">MPI Operator</a>, <a href="https://github.com/helm/charts/tree/master/stable/horovod">Helm Chart</a>, and <a href="https://github.com/IBM/FfDL/tree/master/etc/examples/horovod/">FfDL</a>).
+   </p>
+   </div>
+
+   <script>
+   var acc = document.getElementsByClassName("accordion");
+   var i;
+
+   for (i = 0; i < acc.length; i++) {
+     acc[i].addEventListener("click", function() {
+       this.classList.toggle("active");
+       var panel = this.nextElementSibling;
+       if (panel.style.maxHeight){
+         panel.style.maxHeight = null;
+       } else {
+         panel.style.maxHeight = panel.scrollHeight + "px";
+       }
+     });
+   }
+   </script>
+
+Guides
+------
+
 .. toctree::
    :maxdepth: 2
 
    summary_include
 
-   mpirun
+   concepts_include
 
    api
 
-   concepts_include
+   tensorflow
+
+   keras
+
+   pytorch
+
+   mxnet
 
    running_include
 
    benchmarks_include
 
-   docker_include
+   inference_include
 
    gpus_include
 
-   inference_include
+   docker_include
 
    spark_include
 
@@ -31,8 +111,9 @@ Horovod improves the speed, scale, and resource utilization of deep learning training.
 
    troubleshooting_include
 
+
 Indices and tables
-==================
+------------------
 
 * :ref:`genindex`
 * :ref:`modindex`

docs/keras.rst

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+Horovod with Keras
+==================
+Horovod supports Keras and regular TensorFlow in similar ways.
+
+See full training `simple <https://github.com/horovod/horovod/blob/master/examples/keras_mnist.py>`_ and `advanced <https://github.com/horovod/horovod/blob/master/examples/keras_mnist_advanced.py>`_ examples.
+
+.. NOTE:: Keras 2.0.9 has a `known issue <https://github.com/fchollet/keras/issues/8353>`_ that makes each worker allocate all GPUs on the server, instead of the GPU assigned by the *local rank*. If you have multiple GPUs per server, upgrade to Keras 2.1.2 or downgrade to Keras 2.0.8.
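
Editor's note: the new page links out to the full examples above; as a hedged companion, this is roughly the pattern those examples follow, assuming ``horovod.keras`` with the TensorFlow backend. The small dense model and random arrays are placeholder assumptions, not content from the commit:

.. code-block:: python

    import numpy as np
    import keras
    import tensorflow as tf
    from keras import backend as K
    import horovod.keras as hvd

    # Initialize Horovod.
    hvd.init()

    # Pin each worker to the GPU matching its local rank.
    config = tf.ConfigProto()
    config.gpu_options.visible_device_list = str(hvd.local_rank())
    K.set_session(tf.Session(config=config))

    # Placeholder model and data (assumptions, for illustration only).
    x_train = np.random.rand(1024, 20)
    y_train = keras.utils.to_categorical(np.random.randint(10, size=(1024,)), 10)
    model = keras.models.Sequential([
        keras.layers.Dense(64, activation='relu', input_shape=(20,)),
        keras.layers.Dense(10, activation='softmax'),
    ])

    # Scale the learning rate by the number of workers, then wrap the optimizer.
    opt = keras.optimizers.Adadelta(lr=1.0 * hvd.size())
    opt = hvd.DistributedOptimizer(opt)
    model.compile(loss='categorical_crossentropy', optimizer=opt,
                  metrics=['accuracy'])

    # Broadcast initial variable states from rank 0 to all other processes.
    callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]

    model.fit(x_train, y_train,
              batch_size=128,
              epochs=4,
              callbacks=callbacks,
              verbose=1 if hvd.rank() == 0 else 0)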

docs/mpirun.rst

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,7 @@
-Running Horovod with Open MPI
-=============================
+:orphan:
 
+Run Horovod with Open MPI
+=========================
 ``horovodrun`` introduces a convenient, Open MPI-based wrapper for running Horovod scripts.
 
 In some cases it is desirable to have fine-grained control over options passed to Open MPI. This page describes
@@ -10,14 +11,13 @@ running Horovod training directly using Open MPI.
 
 .. code-block:: bash
 
-    horovodrun -np 4 -H localhost:4 python train.py
+    horovodrun -np 4 python train.py
 
 Equivalent Open MPI command:
 
 .. code-block:: bash
 
     mpirun -np 4 \
-        -H localhost:4 \
         -bind-to none -map-by slot \
         -x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH \
         -mca pml ob1 -mca btl ^openib \