The step that copies xnew back into xlocal can be avoided in two ways: by using pointer operations (swapping xnew and xlocal after each sweep so that the two arrays exchange roles), or by doing two iterations at a time, relaxing from xlocal into xnew and then from xnew back into xlocal. The latter approach requires a second exchange of ghost points and a second relaxation sweep in each pass.
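Both alternatives can be sketched in a few lines of C. The following is only an illustration under simplifying assumptions: it is serial, uses a 1-D array with fixed end values rather than the decomposed 2-D mesh, and omits the ghost-point exchange entirely (comments mark where it would go). The function names `sweep`, `jacobi_swap`, and `jacobi_pairs` are hypothetical and do not come from the original code.

```c
#include <assert.h>
#include <stddef.h>

/* One Jacobi relaxation sweep: reads src, writes dst.
   The two end points are treated as fixed boundary values. */
static void sweep(const double *src, double *dst, size_t n)
{
    dst[0] = src[0];
    dst[n - 1] = src[n - 1];
    for (size_t i = 1; i + 1 < n; i++)
        dst[i] = 0.5 * (src[i - 1] + src[i + 1]);
}

/* Option 1: swap pointers instead of copying xnew back into xlocal.
   Returns the buffer that holds the latest iterate, which is b after
   an odd number of sweeps and a after an even number. */
static double *jacobi_swap(double *a, double *b, size_t n, int iters)
{
    assert(n >= 2);
    double *xlocal = a, *xnew = b;
    for (int it = 0; it < iters; it++) {
        /* parallel version: exchange ghost points of xlocal here */
        sweep(xlocal, xnew, n);
        double *tmp = xlocal;   /* exchange roles, no data copied */
        xlocal = xnew;
        xnew = tmp;
    }
    return xlocal;
}

/* Option 2: two iterations per pass, so the result always ends up
   back in xlocal without any copy or swap. */
static void jacobi_pairs(double *xlocal, double *xnew, size_t n, int pairs)
{
    assert(n >= 2);
    for (int p = 0; p < pairs; p++) {
        /* parallel version: exchange ghost points of xlocal here */
        sweep(xlocal, xnew, n);
        /* parallel version: exchange ghost points of xnew here */
        sweep(xnew, xlocal, n);
    }
}
```

With pointer swapping, the caller has to use the returned pointer, because the newest values may sit in either buffer. The paired version avoids that bookkeeping at the cost of always running an even number of iterations, which also means a convergence test is naturally applied only once every two sweeps.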